```-------------------------------------------------------------------------------
help for regpar                                                  (Roger Newson)
-------------------------------------------------------------------------------

Population attributable and unattributable risks from binary regression models

regpar [if] [in] [weight] , [ atspec(atspec) atzero(atspec0)
subpop(subspec) predict(pred_opt) vce(vcespec) noesample
force iterate(#) level(#) post ]

where atspec and atspec0 are at-specifications recognized by the at()
option of margins, subspec is a subpopulation specification of the form
recognized by the subpop() option of margins, and vcespec is a
variance-covariance specification of the form recognized by margins, and
must have one of the values

delta | unconditional

fweights, aweights, iweights, pweights are allowed; see weight.

Description

regpar calculates confidence intervals for population attributable risks,
and also for scenario proportions.  regpar can be used after an
estimation command whose predicted values are interpreted as conditional
proportions, such as logit, logistic, probit, or glm.  It estimates two
scenario proportions, a baseline scenario ("Scenario 0") and a fantasy
scenario ("Scenario 1"), in which one or more exposure variables are
assumed to be set to particular values (typically zero), and any other
predictor variables in the model are assumed to remain the same.  It also
estimates the difference between the Scenario 0 proportion and the
Scenario 1 proportion.  This difference is known as the population
attributable risk (PAR), and represents the amount of risk attributable
to living in Scenario 0 instead of Scenario 1.

Options for regpar

atspec(atspec) is an at-specification, allowed as a value of the at()
option of margins.  This at-specification must specify a single
scenario ("Scenario 1"), defined as a fantasy world in which a subset
of the predictor variables in the model are set to values different
from their value in the baseline scenario (denoted "Scenario 0" and
equal to the real-life scenario unless atzero() is specified).
regpar uses the margins command to estimate the proportions of
outcome values positive under Scenarios 0 and 1, and then uses nlcom
to estimate the logits of these 2 scenario proportions, and the
hyperbolic arctangent (or Fishers's z-transform) of the difference
between the Scenario 0 proportion and the Scenario 1 proportion,
known as the population attributable risk (PAR).  The proportion
positive under Scenario 1 is known as the population unattributable
risk (PUR).  If atspec() is not specified, then its default value is
atspec((asobserved) _all), implying that Scenario 1 is the real-life
baseline scenario, represented by the predictor values actually
present.

atzero(atspec0) is an at-specification, allowed as a value of the at()
option of margins.  This at-specification must specify a single
baseline scenario ("Scenario 0"), defined as an alternative fantasy
world in which a subset of predictors in the model are set to the
values specified by atspec0.  Scenario 0 will then be compared to the
"Scenario 1" specified by the atspec() option.  If atzero() is not
specified, then its default value is atzero((asobserved) _all),
implying that Scenario 0 is the real-life baseline scenario,
represented by the predictor values actually present.

subpop(subspec), predict(pred_opt) and vce(vcespec) have the same form
and function as the options of the same names for margins.  They
specify the subpopulation, the predict option(s), and the
variance-covariance matrix formula, respectively, used to estimate
the scenario proportions, and therefore to estimate the population
attributable risk.

noesample has the same function as the option of the same name for
margins.  It specifies that computations will not be restricted to
the estimation sample used by the previous estimation command.

force has the same function as the option of the same name for margins.

iterate(#) has the same form and function as the option of the same name
for nlcom.  It specifies the number of iterations used by nlcom to
find the optimal step size to calculate the numerical derivatives of
the logits of the scenario proportions and the z-transform of their
difference, with respect to the original scenario proportions
calculated by margins.

level(#) specifies the percentage confidence level to be used in
calculating the confidence intervals.  If it is not specified, then
it is taken from the current value of the c-class value c(level),
which is usually 95.

post specifies that regpar will post in e() the estimation results for
estimating the logits of the scenario means and the z-transform of
their difference, the PAR.  If post is not specified, then any
existing estimation results are left in e().  Note that the
estimation results posted are for the logits of the scenario
proportions and the z-transform of their difference, and not for the
proportions themselves and their difference.  This is done because
the estimation results are intended to define symmetric confidence
intervals for the transformed parameters, which can be
back-transformed to define asymmetric confidence intervals for the
untransformed parameters.

Remarks

regpar estimates the population attributable risk, defined in its
simplest unadjusted form by Gordis (2000).  The general principles behind
scenario comparisons for generalized linear models were introduced in
Lane and Nelder (1982).

regpar starts by estimating the logits of the scenario proportions and
the hyperbolic arctangent or Fisher's z-transform of their difference
(the PAR), using margins and nlcom.  The results of this estimation are
stored in e(), if the option post is specified.  These estimation results
may be saved in an output dataset (or resultsset) by the parmest package,

regpar assumes that the most recent estimation command estimates the
parameters of a regression model, whose fitted values are conditional
proportions, which must be bounded between 0 and 1.  It is the user's
responsibility to ensure that this is the case.  However, it will be true
if the conditional proportions are defined using a generalized linear
model with a Bernoulli variance function (not a non-Bernoulli binomial
variance function), and a logit, probit or complementary log-log link
function.

Note that population attributable and unattributable risks (PARs and
PURs) are not the same parameters as population attributable and
unattributable fractions (PAFs and PUFs).  A population attributable risk
is a difference between scenario proportions, a population unattributable
fraction is a ratio between scenario proportions (or means), and a
population attributable fraction is the result of subtracting a
population unattributable fraction from 1.  Users who need to estimate
population unattributable fractions should use either punaf (for cohort
or cross-sectional study data) or punafcc (for case-control or survival
study data).  Users who need to estimate scenario prevalences (without
differences) should use margprev.  Users who need to estimate
log-transformed scenario means (without ratios) should use marglmean.
SSC.

Note, also, that regpar estimates population attributable risks (PARs)
using an indirect-standardization method, based on a regression model.
Alternatively, a user can estimate the PAR using a direct-standardization
method, based on rank statistics.  This can be done using the scsomersd,

Examples

The following examples use the dataset lbw.dta, provided by Hosmer and
Lemeshow (1988) and used in [R] logistic and distributed by Stata Press.
This dataset has 1 observation for each of a sample of pregnancies, and
data on the birth weight of the baby and on a list of predictive
variables, which might be assumed to be causal by some scientists.

Setup

.use http://www.stata-press.com/data/r11/lbw.dta, clear
.describe

The following example estimates population unattributable and
attributable risks for maternal smoking during pregnancy as a predictor
of low birth weight.  This is done by comparing "Scenario 1" (a fantasy
world in which no pregnant women smoke) with "Scenario 0" (the real world
in which the data were collected).

. logit low i.race i.smoke, or robust
. regpar, at(smoke=0)

The following example estimates population unattributable and
attributable risks for maternal smoking and non-white race.  This is done
by comparing "Scenario 1" (a fantasy world in which all pregnant women
are white and no pregnant women smoke) with "Scenario 0" (the real world
in which the data were collected).

.logit low i.race i.smoke, or robust
.regpar, at(smoke=0 race=1)

The following example demonstrates the use of regpar with a univariate
model of low birth weight with respect to maternal smoking status, to
estimate total and exposed PARs.  We logit to estimate the odds ratio of
low birth weight with respect to maternal smoking, and use regpar to
estimate the scenario proportions and PARs, first for the total
population, then for the smoking-exposed subpopulation.  Finally, we use
regpar with the atzero() option to compare 2 alternative fantasy
scenarios, a "Scenario 0" in which no mothers smoke and a "Scenario 1" in
which all mothers smoke.  Note that, in this comparison, the PAR is
negative, because a world of non-smoking mothers would have fewer low
birth weight babies than a world of smoking mothers.

.logit low i.smoke, or robust
.regpar, at(smoke=0)
.regpar if smoke==1, at(smoke=0)
.regpar, at(smoke=1) atzero(smoke=0)

The following example demonstrates the use of regpar with the parmest
attributable risks for smoking are estimated using regpar (with the post
option), and saved (in their transformed versions), using parmest, in a
dataset in memory, overwriting the original dataset, with 1 observation
for each of the 3 transformed parameters, named "Scenario_0",
"Scenario_1" and "PAR", and data on the estimates, confidence limits,
P-values, and other parameter attributes.  We then use replace to replace
the symmetric confidence intervals for the transformed parameters with
asymmetric confidence intervals for the untransformed parameters, and
describe and list the new dataset.

. logit low i.race i.smoke, or robust
. regpar, at(smoke=0) post
. parmest, norestore
. foreach Y of var estimate min* max* {
. replace `Y'=invlogit(`Y') if parm!="PAR"
. replace `Y'=tanh(`Y') if parm=="PAR"
. }
. describe
. list

Saved results

regpar saves the following in r():

Scalars
r(rank)             rank of r(V)
r(N)                number of observations
r(N_sub)            subpopulation observations
r(N_clust)          number of clusters
r(N_psu)            number of samples PSUs, survey data only
r(N_strata)         number of strata, survey data only
r(df_r)             variance degrees of freedom, survey data only
r(N_poststrata)     number of post strata, survey data only
r(k_margins)        number of terms in marginlist
r(k_by)             number of subpopulations
r(k_at)             number of at() options (always 2)
r(level)            confidence level of confidence intervals

Macros
r(atzero)           atzero() option
r(atspec)           atspec() option

Matrices
r(cimat)            vector containing estimates and confidence limits
for the scenario proportions and the PAR
r(b)                vector of logits of scenario proportions and their
z-transformed difference
r(V)                estimated variance-covariance matrix of the logits
of scenario proportions and their z-transformed
difference

If post is specified, regpar also saves the following in e():

Scalars
e(rank)             rank of e(V)
e(N)                number of observations
e(N_sub)            subpopulation observations
e(N_clust)          number of clusters
e(N_psu)            number of samples PSUs, survey data only
e(N_strata)         number of strata, survey data only
e(df_r)             variance degrees of freedom, survey data only
e(N_poststrata)     number of post strata, survey data only
e(k_margins)        number of terms in marginlist
e(k_by)             number of subpopulations
e(k_at)             number of at() options (always 2)

Macros
e(cmd)              regpar
e(predict)          program used to implement predict
e(atzero)           atzero() option
e(atspec)           atspec() option
e(properties)       b V

Matrices
e(b)                vector of logits of scenario proportions and their
z-transformed difference
e(V)                estimated variance-covariance matrix of the logits
of scenario proportions and their z-transformed
difference
e(V_srs)            simple-random-sampling-without-replacement
(co)variance hat V_srswor, if svy
e(V_srswr)          simple-random-sampling-with-replacement
(co)variance hat V_srswr, if svy and fpc()
e(V_msp)            misspecification (co)variance hat V_msp, if svy and
available

Functions
e(sample)           marks estimation sample

Author

Roger Newson, National Heart and Lung Institute, Imperial College London,
UK.
Email: r.newson@imperial.ac.uk

References

Gordis, L.  2000.  Epidemiology. 2nd ed.  Philadelphia, PA: W. B.
Saunders.

Hosmer Jr., D. W., S. Lemeshow, and J. Klar.  1988.  Goodness-of-fit
testing for the logistic regression model when the estimated
probabilities are small.  Biometrical Journal 30: 911–924.

Lane, P. W., and J. A. Nelder.  1982.  Analysis of covariance and
standardization as instances of prediction.  Biometrics 38: 613–621.

Also see

Manual:  [R] margins, [R] nlcom, [R] logistic, [R] logit, [R] probit, [R]
glm

Help:  [R] margins, [R] nlcom, [R] logistic, [R] logit, [R] probit, [R]
glm
punaf, punafcc, margprev, marglmean, parmest, scsomersd,
somersd, expgen if installed
```