.-
help for ^psbayes6^
.-

Pseudo-Bayes smoothing of cell estimates ----------------------------------------

^psbayes6^ datavar [priorvar] [^if^ exp] [^in^ range] [ ^, by(^rowvar [colvar [layervar]]^) g^enerate^(^newvar^) p^rob tabdisp_options ]

Description -----------

^psbayes6^ takes datavar, which should be a set of frequencies, and shrinks or smooths it towards a set of frequencies implied by prior probabilities. This will have the effect of replacing sampling zeros by positive estimates whenever the priors are positive.

For a set of data frequencies n_i, summing to n, and a set of prior probabilities q_i, the smoothed estimates are n * p_i, where

n n_i k p_i = ----- --- + ----- q_i, n + k n n + k

and shrinkage is tuned by the constant

2 2 n - sum ( n_i ) k = ----------------------. 2 sum (n_i - n * q_i)

These estimates minimise the total mean square error between estimated and estimand probabilities. For more details, see the References.

If priorvar is specified, it must sum to 1 for the data used. If priorvar is not specified, it is taken to be a set of equal probabilities.

^psbayes6^ is the original version of ^psbayes^, renamed on the promotion of ^psbayes^ to Stata 8. Users of Stata 8 up should change to ^psbayes^.

Options -------

^by(^rowvar colvar layervar^)^ indicates that datavar refers to a table with rows (and columns if specified (and layers if specified)) indexed by the variable(s) named, which will structure a display of cell estimates using ^tabdisp^. If ^by( )^ is not specified, cell estimates will be displayed according to observation numbers.

^generate(^newvar^)^ generates a new variable containing results.

^prob^ indicates that probabilities rather than estimated frequencies are to be shown (and if desired kept).

tabdisp_options are options of ^tabdisp^. Default ^center format(%9.1f)^.

Examples --------

. ^psbayes6 f prior, by(row col) g(sf)^

References ----------

Agresti, A. 1990. Categorical data analysis. New York: John Wiley.

Bishop, Y.M.M., Fienberg, S.E. and Holland, P.W. 1975. Discrete multivariate analysis. Cambridge, MA: MIT Press.

Fienberg, S.E. and Holland, P.W. 1970. Methods for eliminating zero counts in contingency tables. In Patil, G.P. (ed.) Random counts in scientific work. Volume 1: Random counts in models and structures. Pennsylvania State University Press, University Park, 233-260.

Fienberg, S.E. and Holland, P.W. 1972. On the choice of flattening constants for estimating multinomial probabilities. Journal of Multivariate Analysis 2, 127-134.

Fienberg, S.E. and Holland, P.W. 1973. Simultaneous estimation of multinomial cell probabilities. Journal, American Statistical Association 68, 683-691.

Good, I.J. 1965. The estimation of probabilities: an essay on modern Bayesian methods. MIT Press, Cambridge, MA.

Sutherland, M., Holland, P.W. and Fienberg, S.E. 1975. Combining Bayes and frequency approaches to estimate a multinomial parameter. In Fienberg, S.E. and Zellner, A. (eds) Studies in Bayesian econometrics and statistics in honor of Leonard J. Savage. North-Holland, Amsterdam, 585-617.

Author ------

Nicholas J. Cox, University of Durham, U.K. n.j.cox@@durham.ac.uk

Also see --------

On-line: help for @tabdisp@