help for oprobpr                                                    Nick Winter

Displaying predicted probabilities from ordered probit & logit

oprobpr xvar [if exp] [in range] [, levels(level_spec) noesample categories(cat_spec[, ...]) labels(lablist) gtype(plot_type) stack from(#) to(#) inc(#) nwidth(#) nolist save(filename[, replace]) plot(plot) graph_options]


oprobpr plots (and optionally lists) the predicted probabilities from a previously-estimated ordered dependent variable model against a single covariate from the model, holding the other covariates constant.

By default all response categories are listed and plotted; the categories() option allows the user to control plotting of only some categories, or combinations of categories. labels() allows control of the labelling of the plotted lines.

oprobpr is a substantially updated version that takes advantage of Stata's version 8 graphics. With this revision, oprobpr operates as a post-estimation command, rather than estimating and plotting the model in one step. See the important note below on handling interaction terms in the revised command. This program was originally based on logpred, published by Joanne Garrett, and on probpred, published by Mead Over in sg42.2: STB 42.


levels(level_spec) specifies the levels at which to hold the other covariates in the model. By default, all covariates are set at their estimation-sample means (but see the noesample option). Note that the command automatically calculates these means in the estimation sample only; any if and/or in conditions specified with oprobpr will further restrict the calculation of these means.

The levels() option allows you to set covariates to other values; e.g. levels(mpg=50, foreign=1).

In addition, if the model includes interaction terms between the xvar variable you are plotting and some other covariate, they must be specified in the levels() option so that oprobpr can recalculate them as xvar changes. For example, if your model included a variable called mXw, which is the interaction between mpg and weight, and mpg is your xvar, then you would specify levels(mXw=mpg*weight). Of course, you can mix and match the two uses of levels(): levels(weight=2500, mXw=mpg*weight).
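The reason the interaction term has to be recalculated can be seen in a hand-built version of the prediction grid. The following is a minimal sketch using only built-in Stata commands and the auto dataset; it is not oprobpr's own code, and the grid size of 25, the endpoints 12 and 41 (the range of mpg in auto.dta), and the variable names p1 through p5 are illustrative assumptions. It is written as a do-file, so the commands appear without the dot prompt.

```stata
* Sketch only: mimics the idea with built-in commands; not oprobpr's own code.
sysuse auto, clear
* Interaction term included in the model.
generate mXw = mpg*weight
oprobit rep78 mpg weight mXw

* Build a small prediction grid; the estimation results stay in memory.
drop _all
set obs 25
* 25 evenly spaced values of mpg from 12 to 41 (assumed range).
generate mpg = 12 + (_n - 1)*(41 - 12)/24
* Held constant, analogous to levels(weight=2500).
generate weight = 2500
* Recomputed so that the interaction tracks mpg, analogous to levels(mXw=mpg*weight).
generate mXw = mpg*weight
* One predicted probability per category of rep78.
predict p1 p2 p3 p4 p5
list mpg p1-p5 in 1/5
```

Had mXw been fixed at a single value instead, the predictions would not reflect the part of the effect of mpg that runs through the interaction.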

noesample specifies that the calculation of means of covariates should be done without regard to the estimation sample from the model estimation. This means both that any if and/or in conditions specified when the model was estimated are ignored, and that all cases are included in the calculation of each variable's mean (that is, cases are not excluded on a casewise basis). Any if and/or in conditions specified with oprobpr do restrict the calculation of means.

from(#) specifies the lowest value of xvar for which a prediction is to be calculated. The default is to use the minimum of xvar in the estimation sample.

to(#) specifies the highest value of xvar for which a prediction is to be calculated. The default is to use the maximum of xvar in the estimation sample.

newobs(#) specifies the number of observations to be created for calculation and graphing of predicted probabilities. The default is 25; more may be specified if necessary to yield a smooth line.

categories(n,...,n) controls which categories of the dependent variable are plotted and listed. The default is to list and plot probabilities for all categories. For example, cat(1,3,4) would result in categories 1, 3 and 4 only being listed and plotted.

categories() also allows categories to be combined. So, for example, cat(1+2,3,4+5+6) would plot three lines: one that is the sum of probabilities for categories one and two, one that is the probability of category three, and one that is the sum of categories 4 through 6.
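Combining categories amounts to adding their predicted probabilities. A minimal sketch of that arithmetic with built-in commands follows; it is not oprobpr's own code, the variable names are hypothetical, and because rep78 in the auto dataset has only five categories the grouping shown is the analogue of cat(1+2,3,4+5).

```stata
* Sketch only: the sums that a specification like cat(1+2,3,4+5) implies.
sysuse auto, clear
oprobit rep78 mpg weight gear_ratio foreign
* Predicted probability of each of the five categories of rep78.
predict p1 p2 p3 p4 p5
* Hypothetical names for the combined categories.
generate p_1_2 = p1 + p2
generate p_4_5 = p4 + p5
summarize p_1_2 p3 p_4_5
```

Since the five category probabilities sum to one for every observation, the three combined quantities do as well.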

labels() specifies text labels with which to label the lines. By default, simple categories are labeled with the appropriate value label from the dependent variable, if available. Otherwise, they are labeled "Category 1", "Category 2", and so on through "Category n". For example, lab(Low Medium High) would label the lines "Low", "Medium", and "High". To leave a line unlabeled, indicate a "." for its label.

Enclose any labels that have spaces with quotation marks: lab("Very Low" Low Medium High "Very High")

Note that the labels can also be specified within the legend() option's label() suboption; this just provides a somewhat easier way to change multiple labels with a single option.

gtype() specifies the type of twoway graph to create. By default this is scatter.

stack causes the categories to be calculated as cumulative probabilities, allowing the creation of a stacked plot; when this is specified the gtype() defaults to area.
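The cumulative probabilities behind a stacked plot can be sketched by hand as running sums over the categories. The example below uses only built-in commands and is an illustration under stated assumptions, not oprobpr's own code: the two-covariate model, the grid of 25 mpg values from 12 to 41, the fixed weight of 2500, and the names c1 through c5 are all chosen for illustration.

```stata
* Sketch only, not oprobpr's own code. Intended to be run as a do-file.
sysuse auto, clear
oprobit rep78 mpg weight

* Prediction grid over mpg with weight held at 2500 (illustrative values).
drop _all
set obs 25
generate mpg = 12 + (_n - 1)*(41 - 12)/24
generate weight = 2500
predict p1 p2 p3 p4 p5

* Running sums: c1 = P(1), c2 = P(1 or 2), ..., c5 = 1.
generate c1 = p1
generate c2 = c1 + p2
generate c3 = c2 + p3
generate c4 = c3 + p4
generate c5 = c4 + p5

* Draw the largest area first so the smaller ones overlay it.
twoway (area c5 mpg) (area c4 mpg) (area c3 mpg) (area c2 mpg) (area c1 mpg)
```

The height of each visible band is then the probability of a single category.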

nwidth() specifies the width of the note that oprobpr creates to indicate the levels of the covariates. The default is 80.

save(filename) saves the prediction data set. This is useful for conducting additional analysis of the predicted values. (Note that the graph option saving() is different, and may be used to save the resulting graph.)

plot(plot) provides a way to add other plots to the generated graph; see help plot_option. Note that the dataset will be intact, but will have additional observations added at the end for the calculation of predicted probabilities. To exclude these observations from any additional plots, include the condition "in @@@" in those plots; this will be translated to cover the range of the original dataset.

graph_options can be any valid options for a twoway graph.


. oprobit rep78 mpg weight gear_ratio foreign
. oprobpr mpg

Plots the predicted probabilities for categories of rep78 against mpg, holding the other covariates constant at their estimation-sample means.
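As a rough cross-check on the plotted values, the same kind of quantity can be requested from Stata's built-in margins command. This is a sketch that assumes a release of Stata that includes margins (it is not available in Stata 8) and is not part of oprobpr; the choice of outcome 3 and of the mpg values is illustrative.

```stata
* Sketch only: cross-check using built-in commands, not oprobpr itself.
sysuse auto, clear
oprobit rep78 mpg weight gear_ratio foreign
* Probability that rep78 equals 3 at selected values of mpg,
* with the other covariates held at their estimation-sample means.
margins, predict(outcome(3)) at(mpg=(12(5)37)) atmeans
```

The reported margins should correspond to points on the rep78==3 line of the plot, up to the spacing of the prediction grid.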

. oprobpr mpg, levels(weight=2500, foreign=0)

Same as above, except predictions are for foreign==0 and weight==2500 instead of for the sample averages of those variables.

. generate mpgXweight = mpg*weight
. ologit rep78 mpg weight mpgXweight gear_ratio foreign
. oprobpr mpg, levels(weight=2500, mpgXweight=mpg*weight, foreign=0)

Same as above, except that the interaction term mpg*weight is included in the original model and is specified in the oprobpr command. Weight is held constant at 2500, foreign at 0, and gear_ratio at its sample mean. The model estimated is ordered logit rather than ordered probit.

. oprobpr mpg, cat(1,3+4,5) lab(Low "Medium & High" "Very High") clpattern(dash dot dash_dot) msym(i i i) title( , size(small))

Plots only categories 1, the sum of 3 and 4, and 5 of rep78, and labels them "Low", "Medium & High", and "Very High", respectively. The default line patterns and marker symbols are overridden, and the size of the overall title is adjusted. Note that any other option which works on the graph command will also work here.

. oprobpr mpg, stack

Probabilities are graphed as a stacked area plot instead of as a series of lines.

. oprobpr mpg, plot(function y=.75, range(mpg)) legend(order(1 2 3 4 5))

An additional plot is overlaid, in this case a horizontal line at 0.75, specified to run over the range of the variable mpg. The legend option is specified to suppress the listing of this additional plot in the legend. Note that this effect could also be achieved with the yline() option.


Nicholas Winter, University of Virginia, nwinter@virginia.edu

Also see

On-line: help for predict, oprobit, ologit, graph twoway