.-
help for ^simqi^ Version 2.1
.-
Simulates quantities of interest
--------------------------------
^simqi^ [^, pv genpv(^newvar1 newvar2...^)^
^ev^ ^genev(^newvar1 newvar2...^)^
^pr^ ^prval(^value1 value2...^)^ ^genpr(^newvar1 newvar2...^)^
^fd(^existing option^)^ ^changex(^var1 val1 val2 [^&^ var2 val1 val2] ^)^
^msims(^#^)^ ^tfunc(^function^)^ ^l^evel^(^#^)^ ^listx^ ]
Description
-----------
After simulating parameters from the last estimation (see help @estsimp@) and
setting values for the explanatory variables (see help @setx@), use ^simqi^ to
simulate various quantities of interest, including predicted values, expected
values, and first differences.
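As a minimal sketch of this sequence, assume a dataset with an outcome y and
explanatory variables x1 and x2 (hypothetical names, used only for
illustration). One might type:
. ^estsimp regress y x1 x2^
. ^setx mean^
. ^simqi^
Here ^estsimp^ simulates the parameters, ^setx mean^ holds each explanatory
variable at its mean, and ^simqi^ reports the default quantity of interest
for the estimated model.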
Predicted values contain two forms of uncertainty: "fundamental" uncertainty
arising from sheer randomness in the world, and "estimation" uncertainty
caused by not having an infinite number of observations. More technically,
predicted values are random draws of the dependent variable from the
stochastic component of the statistical model, given a random draw from the
posterior distribution of the unknown parameters.
If there were no estimation uncertainty, the expected value would be a
single number representing the mean of the distribution of predicted values.
But estimates are never certain, so the expected value must be a
distribution rather than a point. To obtain this distribution, we
average away the fundamental variability, leaving only estimation uncertainty.
For this reason, expected values have a smaller variance than predicted
values, even though the point estimate should be roughly the same in both
cases. ^simqi^ calculates two kinds of expected values: the expected value
of Y and the probability that Y takes on a particular value. For models
in which these two quantities are equal, ^simqi^ avoids redundancy by
reporting only the probabilities.
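To see the difference in spread, one could save both kinds of simulations
after a linear regression and summarize them. This is only a sketch: simpv
and simev are hypothetical variable names, and it assumes that ^genpv()^ and
^genev()^ can be combined in one call, as the syntax diagram suggests.
. ^simqi, genpv(simpv) genev(simev)^
. ^summarize simpv simev^
The two means should be close, while the standard deviation of simev should
be smaller than that of simpv.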
Note: simulated expected values are equivalent to simulated probabilities for
all the discrete choice models that ^simqi^ supports (logit, probit, ologit,
oprobit, mlogit). In these models, the expected value of Y is a vector, with
each element indicating the probability that Y=j. Consider an ordered probit
with outcomes 1, 2, 3. The expected value is [Pr(Y=1), Pr(Y=2), Pr(Y=3)],
the mean of a multinomial distribution that generates the dependent variable.
A first difference is the difference between two expected values. To
simulate first differences, use the ^fd()^ "wrapper", which is described below.
It is possible to compute many other quantities of interest based on the
output from ^simqi^. For examples of such quantities, see the paper by
King, Tomz and Wittenberg (2000) cited at the end of this help file.
Default Output
--------------
^simqi^ can generate predicted values, expected values and first differences
for all the models that it supports. By default, however, it will only
report the quantities of interest that appear in the table below. To view
other quantities of interest or save the simulated quantities as new
variables that can be analyzed and graphed, use one of ^simqi^'s options.
    Statistical      Quantities displayed
    Model            by default
    -----------      --------------------------
    regress          E(Y)
    logit            Pr(Y=1)
    probit           Pr(Y=1)
    ologit           Pr(Y=j) for all outcomes j
    oprobit          Pr(Y=j) for all outcomes j
    mlogit           Pr(Y=j) for all outcomes j
    poisson          E(Y)
    nbreg            E(Y)
    sureg            E(Y_j) for all equations j
    weibull          E(Y)
Options
-------
^pv^ displays a summary of the predicted values that ^simqi^ generated via
simulation.
^genpv(^newvar1 newvar2...^)^ saves the predicted values as new variables in
the current dataset. For single-equation models, you may specify only one
new variable; each "observation" of that new variable will contain one
simulated predicted value. For multiple-equation models such as @sureg@,
you may specify as many new variables as there are outcome variables in
the model.
^pr^ displays a summary of the probabilities that ^simqi^ generated via
simulation.
^prval(^value1 value2 ...^)^ instructs ^simqi^ to evaluate the probability
that the dependent variable takes on each of the listed values. The
values must appear in ascending order without any duplicates.
^genpr(^newvar1 newvar2 ...^)^ saves the simulated probabilities as new
variables in the current dataset. Each new "observation" represents
one simulated probability. If both the ^prval()^ option and the ^genpr()^
option are used, ^simqi^ will save Pr(Y==value1) in newvar1, Pr(Y==value2)
in newvar2, etc. If the ^prval()^ option is not specified, ^genpr()^ will
save the probabilities in the same ascending order as the outcome
values of the dependent variable.
^ev^ displays a summary of expected values that ^simqi^ generated via
simulation. This option is not available for discrete choice models,
where it is redundant with ^pr^.
^genev(^newvar1 newvar2 ... ^)^ saves the expected values in new variables.
For single-equation models, you may specify only one new variable. Each
observation of newvar will contain one simulated expected value of the
dependent variable. For multiple-equation models such as @sureg@,
you may specify as many new variables as there are outcome variables in
the model. The ^genev()^ option is not available for discrete choice
models, where it is redundant with ^genpr()^.
^fd(^existing option^)^ is a "wrapper" that makes it easy to simulate first
differences. Simply wrap ^fd()^ around an existing option and
specify the ^changex()^ option.
^changex(^var1 val1 val2^)^ specifies how the explanatory variables (the x's)
should change when evaluating a first difference. ^changex^ uses the
same basic syntax as @setx@, except that each explanatory variable has
two values: a starting value and an ending value. For instance, ^fd(ev)^
^changex(x1 .2 .8)^ instructs ^simqi^ to simulate a change in the expected
value of Y caused by increasing x1 from its starting value, 0.2, to its
ending value, 0.8.
^msims(^#^)^ sets the number of simulations to be used when calculating
expected values. The number must be a positive integer. By default,
the value of msims is set at 1000. ^simqi^ disregards the msims option
whenever the expected value is parametrically defined.
^tfunc(^function^)^ allows the user to specify a transformation function for
transforming the dependent variable. This option is only available for
@regress@ and @sureg@. The currently supported functions are
    Function         Transformation (for all variables j)
    -----------      ------------------------------------
    squared          y_j ----> y_j * y_j
    sqrt             y_j ----> sqrt(y_j)
    exp              y_j ----> exp(y_j)
    ln               y_j ----> ln(y_j)
    logiti           y_j ----> inverselogit(y_j)
The inverse logit function is exp(y_j)/(1+SUM[exp(y_j)]) where the summation
is done over all the j's.
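For example, suppose the dependent variable was logged before estimation.
The following sketch, which assumes hypothetical variables y, x1 and x2,
shows how one might request expected values on the original scale:
. ^generate lny = ln(y)^
. ^estsimp regress lny x1 x2^
. ^setx mean^
. ^simqi, ev tfunc(exp)^
With ^tfunc(exp)^, the simulated quantities are transformed from the scale
of ln(y) back to the scale of y.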
^l^evel^(^#^)^ specifies the confidence level, in percent, for confidence
intervals. The default is ^level(95)^ or the value set by ^set l^evel. For
more information on ^set l^evel, see the on-line help for @level@.
^listx^ instructs ^simqi^ to list the x-values that were used to produce
the quantities of interest. These values were set using the @setx@ command.
Basic Examples
--------------
To display the default quantities of interest for the last estimated model,
type:
. ^simqi^
For a summary of the simulated expected values, type:
. ^simqi, ev^
For a summary of the simulated probabilities, Pr(Y=j), for all j categories
of the dependent variable, type:
. ^simqi, pr^
To display only a summary of Pr(Y=1), the probability that the dependent
variable takes on a value of 1, type:
. ^simqi, prval(1)^
To generate first differences, use the fd() wrapper and the changex() option.
For instance, the following command will simulate the change in the expected
value of Y caused by increasing x4 from 3 to 7, while holding other
explanatory variables at their means:
. ^setx mean^
. ^simqi, fd(ev) changex(x4 3 7)^
To simulate the change in the simulated probabilities, Pr(Y=j), for all j
categories of the dependent variable, given an increase in x4 from its
minimum to its mean, type:
. ^setx mean^
. ^simqi, fd(pr) changex(x4 min mean)^
If you are only interested in the change in Pr(Y=1) caused by raising
x4 from its 20th to its 80th percentile when other variables are held at
their means, type:
. ^setx mean^
. ^simqi, fd(prval(1)) changex(x4 p20 p80)^
More Intricate Examples
-----------------------
To display not only the simulated expected values but also the x-values used
to produce them, we would type:
. ^simqi, ev listx^
^simqi^ displays 95% confidence intervals by default, but we could modify
the previous example to give a 90% confidence interval for the expected
value:
. ^simqi, ev listx level(90)^
To save the simulated expected values in a new variable called predval, type:
. ^simqi, genev(predval)^
To simulate Pr(Y=0), Pr(Y=3), and Pr(Y=4), and then save the simulated
probabilities as variables called simpr0, simpr3 and simpr4, type:
. ^simqi, prval(0 3 4) genpr(simpr0 simpr3 simpr4)^
The changex option can be arbitrarily complicated. Suppose that we want
to simulate the change in Pr(Y=1) caused by simultaneously increasing x1
from .2 to .8 and x2 from ln(7) to ln(10). The following lines will
produce the quantities we seek:
. ^setx mean^
. ^simqi, fd(prval(1)) changex(x1 .2 .8 x2 ln(7) ln(10))^
We could augment the previous example by requesting a second first difference,
caused by increasing x3 from its median to its 90th percentile. Simply
separate the two changex requests with an ampersand:
. ^setx mean^
. ^simqi, fd(prval(1)) changex(x1 .2 .8 x2 ln(7) ln(10) & x3 median p90)^
Likewise, the fd() option can be as intricate as we would like. For
instance, suppose that we have run a poisson regression. We want to see what
happens to Pr(Y=2), Pr(Y=3), and the expected count when we increase x1 from
its minimum to its maximum. To obtain our quantities of interest, we would
type:
. ^setx mean^
. ^simqi, fd(prval(2 3)) fd(ev) changex(x1 min max)^
^simqi^ allows us to save any simulated variable for subsequent analysis.
To find the mean, standard deviation, and a confidence interval around any
quantity of interest that has been saved in memory, use the @sumqi@ command.
To graph the simulations, use @graph@ or @kdensity@.
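A sketch of that workflow, using the hypothetical variable name predval
and assuming a model for which ^genev()^ is available:
. ^simqi, genev(predval)^
. ^sumqi predval^
. ^kdensity predval^
The first line saves the simulated expected values, the second summarizes
them, and the third plots a kernel density estimate of the simulations.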
Distribution
------------
^simqi^ is part of CLARIFY, a suite of Stata programs for interpreting
statistical results, and is (C) Copyright, 1999-2003, Michael Tomz, Jason
Wittenberg and Gary King, All Rights Reserved. You may copy and
distribute this program provided no charge is made and the copy is
identical to the original. To request an exception, please contact:
Michael Tomz
Department of Political Science
Encina Hall, Stanford University
Stanford, CA 94305-6044
We recommend that you distribute the current version of this program,
which is available from http://GKing.Harvard.Edu.
Reference
---------
If you use this program, please cite:
Michael Tomz, Jason Wittenberg, and Gary King. 2003. CLARIFY:
Software for Interpreting and Presenting Statistical Results.
Version 2.1. Stanford University, University of Wisconsin,
and Harvard University. January 5. Available at
http://gking.harvard.edu/
and
Gary King, Michael Tomz, and Jason Wittenberg. 2000. "Making
the Most of Statistical Analyses: Improving Interpretation and
Presentation." American Journal of Political Science 44, no. 2
(April): 347-61.