.-
help for ^relogitq^
.-

Calculates quantities of interest after a corrected logit regression
--------------------------------------------------------------------

   ^relogitq^ [^, pr^ ^bayes^ ^mle^ ^unbi^ased ^listx^ 
               ^fd(pr)^ ^changex(^var1 val1 val2 [^&^ var2 val1 val2]^)^
               ^rr(^var1 val1 val2 [^&^ var2 val1 val2]^) sims(^#^) l^evel^(^#^)^]


Description
-----------

This procedure implements the suggestions of King and Zeng (1999a,b)
for improved methods of computing quantities of interest -- absolute
risks (probabilities), relative risks, and attributable risks (first
differences) -- from a logistic regression that is corrected for small
samples and rare events, as well as selection on the dependent
variable as in case-control designs.  First run a corrected logit (see
help @relogit@) and set values for the explanatory variables (see help
@setx@).  Then use ^relogitq^ to calculate the desired quantities of
interest.

Note: The ^relogitq^ procedure is memory-intensive.  If you receive an error
message such as "no room to add more variables," you may need to allocate
more memory to Stata by typing "clear" and then "set memory #m".  See
[R] memory in the reference manual for more details about memory allocation.


Options That Affect Which Quantities are Calculated
---------------------------------------------------

^pr^ reports Pr(depvar==1|x), the probability (absolute risk) that the
   dependent variable takes on a value of 1 when the explanatory
   variables (x) are set at values that were chosen at the @setx@
   stage.  If no other options are specified, this is the default
   output.

^fd(pr)^ is a "wrapper" that makes it easy to simulate first differences
   (also called attributable risks).  Simply wrap the fd() wrapper around
   the ^pr^ option to estimate the change in Pr(Y=1) given some change in x,
   holding other variables at the values that were set at the @setx@
   stage.  The ^fd()^ wrapper must be used in conjunction with the
   ^changex()^ option.

^changex(^var1 val1 val2^)^ specifies how the explanatory variables
   (x) should change when evaluating a first difference (attributable
   risk). ^changex()^ uses the same basic syntax as @setx@, except
   that each explanatory variable has two values: a starting value and
   an ending value.  For instance, ^fd(pr)^ ^changex(x1 .2 .8)^
   calculates the change in Pr(Y=1) caused by increasing x1 from its
   starting value, 0.2, to its ending value, 0.8.  You can specify
   multiple changex scenarios by separating each scenario with an
   ampersand.  See the examples, below.

^rr(^var1 val1 val2^)^ specifies how the explanatory variables (x)
   should change when calculating the relative risk,
   Pr(Y=1|xend)/Pr(Y=1|xstart), where xstart represents the vector of
   starting values for x and xend represents the vector of ending
   values for x.  ^rr()^ uses the same basic syntax as ^changex^.  For
   instance, ^rr(x1 mean p75)^ instructs ^relogitq^ to calculate the
   relative risk of Pr(Y=1) caused by increasing x1 from its mean to
   its 75th percentile, holding other variables at the levels chosen
   at the @setx@ stage.  In this example, x1 is set to its mean in
   xstart and set to its 75th percentile in xend.  If you are
   interested in the percentage change in relative risk, compute
   100*[rr - 1], where rr is the output from this command.  You can
   specify multiple rr() scenarios by separating each scenario with an
   ampersand.

^listx^ causes ^relogitq^ to list all x-values that were chosen at the @setx@
   stage and provide a basis for predicted probabilities, first differences
   and relative risks.

^l^evel^(^#^)^ specifies the confidence level, in percent, for confidence     
   intervals around quantities of interest.  The default is ^level(95)^ or
   the value set by ^set l^evel.  For more information on ^set l^evel, see
   the on-line help for @level@.

   
Options that Affect How the Quantities are Calculated
-----------------------------------------------------

By default, ^relogitq^ uses stochastic simulation to compute all
quantities of interest and the uncertainty surrounding those
quantities.  The program reports the median of the simulated posterior
density, as well as confidence intervals around the median.
^relogitq^ also supports analytical methods for obtaining point
estimates of quantities of interest, but continues to use simulation
to measure the uncertainty.  The following three analytical methods
are available:

^mle^ instructs ^relogitq^ to calculate point estimates for quantities of
   interest based only on the the maximum likelihood estimates (the
   coefficients generated by @relogit@), without accounting for their
   uncertainty.  For instance, mle option computes the probability
   Pr(Y=1|x,b) using the formula 1/(1+exp(-x*b)), where x is the
   vector of x's that was chosen at the @setx@ stage and b represents
   the vector of logit (or relogit) coefficients.  This approach is
   consistent but has higher mean square error and so is not generally
   recommended.

^unbi^ased instructs ^relogitq^ to calculate approximately unbiased estimates
   of all quantities of interest.  This option has a higher mean squared error
   than the Bayesian alternative, which is superior in most cases.

^bayes^ uses the entire probability distribution of b to approximate the
   expected value of Pr(Y=1|x), without conditioning on the point
   estimate b.  This approach has the lowest mean squared error and is
   recommended for users who prefer the analytical approach.

The program also contains an option to control the simulation process, which
produces all measures of uncertainty.

^sims(^m^)^ specifies the number of simulations, m, which must be a positive
   integer.  The default is 1000 simulations.  Increase the number of
   simulations to obtain more precise approximations to quantities of
   interest; reduce the number of simulations for greater
   computational speed.  You can determine whether you have enough
   precision by repeating a relogitq command with the same number of
   simulations and seeing whether you have sufficient digits of
   precision.  If you choose a large number of simulations, you
   may need to allocate more memory to Stata.  See [R] memory in the
   reference manual for more details about memory allocation.


Examples
--------

To display Pr(Y=1|x), where x represents the values that were chosen
at the @setx@ stage, type

   . ^relogitq^

To obtain the same quantity of interest via analytical Bayesian
methods and list all x-values chosen at the @setx@ stage, type type

   . ^relogitq, bayes listx^

Use the ^fd()^ and ^changex()^ options to calculate the effects of
changes in probabilties caused by movements in x.  For instance, the
following command will calculate the change in Pr(Y=1) caused by
increasing the explanatory variable x1 from its 20th to its 80th
percentile.

   . ^relogitq, fd(pr) changex(x1 p20 p80)^


You specify many changex() scenarios by separating each scenario with
an ampersand.  The following expression will calculate two first
differences (attributable risks): the change in Pr(Y=1) caused by
increasing x1 from its mean to its maximum level, and the change in
Pr(Y=1) caused by simultaneously incrasing x1 from 3 to the square
root of 15 and increasing x2 from its median to its 90th percentile.

   . ^relogitq, fd(pr) changex(x1 mean max & x1 3 sqrt(15) x2 median p90)^


A similar syntax applies to relative risks.  Thus, the next command
gives the percentage change in relative risk of Pr(Y=1) caused by
raising x1 from 10 to 15.

   . ^relogitq, rr(x1 10 15)^



Saved Results
-------------

^relogitq^ saves the following scalars:

  r(Pr)     = Point estimate for Pr(Y=1|x), where x was set with @setx@
  r{PrL)    = Lower bound of confidence interval for Pr(Y=1|x)
  r(PrU)    = Upper bound of confidence interval for Pr(Y=1|x)
  r(dPr_#)  = Point estimate of change in Pr(Y=1) for 1st difference scenario #
  r(dPrL_#) = Lower bound of confidence interval for first difference #
  r(dPrU_#) = Upper bound of confidence interval for first difference #
  r(rr_#)   = Point estimate of % change in Pr(Y=1) for relative risk scenario #
  r(rrL_#)  = Lower bound of confidence interval for relative risk #
  r(rrU_#)  = Upper bound of confidence interval for relative risk #


Distribution
------------

    ^relogitq^ is (C) Copyright, 1999, Michael Tomz, Gary King and Langche
    Zeng, All Rights Reserved.  You may copy and distribute this program
    provided no charge is made and the copy is identical to the original.
    To request an exception, please contact:

    Michael Tomz <tomz@@fas.harvard.edu>
    Department of Government, Harvard University
    Littauer Center North Yard
    Cambridge, MA 02138

    Please distribute the current version of this program, which is
    available from http://GKing.Harvard.Edu.


References
----------

    Gary King and Langche Zeng. 1999a. "Logistic Regression in Rare
    Events Data," Department of Government, Harvard University,
    available from http://GKing.Harvard.Edu.

    Gary King and Langche Zeng. 1999b. "Estimating Absolute, Relative,
    and Attributable Risks in Case-Control Studies," Department of
    Government, Harvard University, available from
    http://GKing.Harvard.Edu.