-------------------------------------------------------------------------------
help for psbayes
-------------------------------------------------------------------------------

Pseudo-Bayes smoothing of cell frequencies

psbayes datavar [priorvar] [if exp] [in range] [, by(rowvar [colvar [layervar]]) generate(newvar) prob tabdisp_options]

Description

psbayes takes datavar, which should be a set of frequencies, and shrinks or smooths it towards a set of frequencies implied by prior probabilities. This has the effect of replacing sampling zeros by positive estimates whenever the priors are positive. For a set of data frequencies n_i, summing to n, and a set of prior probabilities q_i, the smoothed estimates are n * p_i, where

    p_i = [n / (n + k)] * (n_i / n) + [k / (n + k)] * q_i,

and shrinkage is tuned by the constant

    k = [n² - SUM (n_i²)] / SUM (n_i - n * q_i)².

These estimates minimise the total mean square error between estimated and estimand probabilities. For more details, see the References.
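
The arithmetic in the two formulas above can be checked by hand with a short script. The following Python sketch is an illustration only and is not the code of psbayes itself: the function name pseudo_bayes, its arguments and the example counts are all hypothetical. It assumes equal prior probabilities when none are given, as described for priorvar, and it assumes that when the data match the prior frequencies exactly (so that the denominator of k is zero) the prior probabilities are returned unchanged.

```python
def pseudo_bayes(freqs, priors=None):
    """Return (k, probs) for the pseudo-Bayes formulas in this help file.

    Hypothetical helper written for illustration; not part of psbayes.
    """
    m = len(freqs)
    n = float(sum(freqs))
    if n <= 0:
        raise ValueError("frequencies must sum to a positive total")
    if priors is None:
        # Assumption: equal prior probabilities when none are supplied.
        priors = [1.0 / m] * m
    if len(priors) != m:
        raise ValueError("priors and freqs must have the same length")
    if abs(sum(priors) - 1.0) > 1e-8:
        raise ValueError("prior probabilities must sum to 1")

    # Denominator of k: SUM (n_i - n * q_i)^2
    denom = sum((f - n * q) ** 2 for f, q in zip(freqs, priors))
    if denom == 0:
        # Assumption: data equal the prior frequencies, so p_i = q_i.
        return float("inf"), list(priors)

    # k = [n^2 - SUM (n_i^2)] / SUM (n_i - n * q_i)^2
    k = (n * n - sum(f * f for f in freqs)) / denom

    # p_i = [n / (n + k)] * (n_i / n) + [k / (n + k)] * q_i
    w = n / (n + k)
    probs = [w * f / n + (1.0 - w) * q for f, q in zip(freqs, priors)]
    return k, probs


# Four cells with one sampling zero and equal priors of 0.25.
counts = [5, 0, 3, 2]
k, probs = pseudo_bayes(counts)
smoothed = [sum(counts) * p for p in probs]
print(round(k, 3))
print([round(s, 3) for s in smoothed])
```

With these made-up counts, n = 10, SUM (n_i²) = 38 and SUM (n_i - n * q_i)² = 13, so k = 62/13. The zero cell receives a positive smoothed frequency, and the smoothed frequencies still sum to n.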

If priorvar is specified, it must sum to 1 for the data used. If priorvar is not specified, it is taken to be a set of equal probabilities.

Options

by(rowvar [colvar [layervar]]) indicates that datavar refers to a table with rows, and optionally columns and layers, indexed by the variable(s) named, which will structure a display of cell estimates using tabdisp. If by() is not specified, cell estimates will be displayed according to observation numbers.

generate(newvar) generates a new variable containing results.

prob indicates that probabilities rather than estimated frequencies are to be shown (and, if generate() is specified, kept).

tabdisp_options are options of tabdisp. The defaults are center format(%9.1f).

Examples

. psbayes f prior, by(row col) g(sf)

. contract foreign rep78, zero nomiss
. psbayes _freq, by(foreign rep78) prob

Author

Nicholas J. Cox, University of Durham, U.K.
n.j.cox@durham.ac.uk

References

Agresti, A. 2002. Categorical data analysis. Hoboken, NJ: John Wiley.

Bishop, Y.M.M., Fienberg, S.E. and Holland, P.W. 1975. Discrete multivariate analysis. Cambridge, MA: MIT Press.

Fienberg, S.E. and Holland, P.W. 1970. Methods for eliminating zero counts in contingency tables. In Patil, G.P. (ed.) Random counts in scientific work. Volume 1: Random counts in models and structures. Pennsylvania State University Press, University Park, 233-260.

Fienberg, S.E. and Holland, P.W. 1972. On the choice of flattening constants for estimating multinomial probabilities. Journal of Multivariate Analysis 2: 127-134.

Fienberg, S.E. and Holland, P.W. 1973. Simultaneous estimation of multinomial cell probabilities. Journal, American Statistical Association 68: 683-691.

Good, I.J. 1965. The estimation of probabilities: an essay on modern Bayesian methods. MIT Press, Cambridge, MA.

Sutherland, M., Holland, P.W. and Fienberg, S.E. 1975. Combining Bayes and frequency approaches to estimate a multinomial parameter. In Fienberg, S.E. and Zellner, A. (eds) Studies in Bayesian econometrics and statistics in honor of Leonard J. Savage. North-Holland, Amsterdam, 585-617.

Also see

On-line: help for tabdisp