{smcl}
{* 16aug2004}{...}
{hline}
help for {hi:psbayes}
{hline}

{title:Pseudo-Bayes smoothing of cell frequencies} 

{p 8 17 2} 
{cmd:psbayes}
{it:datavar}
[{it:priorvar}] 
[{cmd:if} {it:exp}] 
[{cmd:in} {it:range}]
[ 
{cmd:, by(}{it:rowvar} [{it:colvar} [{it:layervar}]]{cmd:)}
{cmdab:g:enerate(}{it:newvar}{cmd:)}
{cmdab:p:rob} 
{it:tabdisp_options} 
]


{title:Description}

{p 4 4 2} 
{cmd:psbayes} takes {it:datavar}, which should be a set of frequencies, and
shrinks or smooths it towards a set of frequencies implied by prior
probabilities. This replaces sampling zeros by positive estimates
whenever the corresponding prior probabilities are positive.

{p 4 4 2} 
For a set of data frequencies n_i, summing to n, and a set of prior
probabilities q_i, the smoothed estimates are n * p_i, where

                 n   n_i       k
        p_i =  {hline 5} {hline 3}  +  {hline 5} q_i,
               n + k  n      n + k

{p 4 4 2} 
and shrinkage is tuned by the constant

              n{c 178}  - SUM ( n_i{c 178})
        k = {hline 20}.
            SUM (n_i - n * q_i){c 178}

{p 4 4 2} 
These estimates minimise the total mean square error between the
estimated probabilities and the probabilities being estimated. For more
details, see the References.
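
{p 4 4 2}
As an illustration only, the formulas above can be traced by hand in Stata.
The sketch below assumes three cells with frequencies 0, 2 and 4 and equal
priors of 1/3. The variable and scalar names ({cmd:freq}, {cmd:prior},
{cmd:ntot}, {cmd:kconst} and so on) are made up for this example and are
not created or used by {cmd:psbayes}. Here n = 6, so k = (36 - 20) / 8 = 2,
the probabilities p_i are 1/12, 1/3 and 7/12, and the smoothed frequencies
n * p_i are 0.5, 2 and 3.5.

{p 4 4 2}
Set up the frequencies, the equal priors and the total n:

{p 4 8 2}{cmd:. clear}{p_end}
{p 4 8 2}{cmd:. set obs 3}{p_end}
{p 4 8 2}{cmd:. generate double freq = 2 * (_n - 1)}{p_end}
{p 4 8 2}{cmd:. generate double prior = 1 / _N}{p_end}
{p 4 8 2}{cmd:. summarize freq, meanonly}{p_end}
{p 4 8 2}{cmd:. scalar ntot = r(sum)}{p_end}

{p 4 4 2}
Compute the numerator and denominator of k, then k itself:

{p 4 8 2}{cmd:. generate double sq = freq^2}{p_end}
{p 4 8 2}{cmd:. generate double dev2 = (freq - scalar(ntot) * prior)^2}{p_end}
{p 4 8 2}{cmd:. summarize sq, meanonly}{p_end}
{p 4 8 2}{cmd:. scalar knum = scalar(ntot)^2 - r(sum)}{p_end}
{p 4 8 2}{cmd:. summarize dev2, meanonly}{p_end}
{p 4 8 2}{cmd:. scalar kconst = scalar(knum) / r(sum)}{p_end}

{p 4 4 2}
Weight the observed proportions by n / (n + k) and the priors by
k / (n + k), then scale back to frequencies:

{p 4 8 2}{cmd:. scalar wt = scalar(ntot) / (scalar(ntot) + scalar(kconst))}{p_end}
{p 4 8 2}{cmd:. generate double phat = scalar(wt) * freq / scalar(ntot) + (1 - scalar(wt)) * prior}{p_end}
{p 4 8 2}{cmd:. generate double sfreq = scalar(ntot) * phat}{p_end}
{p 4 8 2}{cmd:. list freq prior phat sfreq}{p_end}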

{p 4 4 2} 
If {it:priorvar} is specified, it must sum to 1 for the data used. If
{it:priorvar} is not specified, it is taken to be a set of equal
probabilities.
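
{p 4 4 2}
One way to build such a {it:priorvar}, sketched here under the assumption
that a hypothetical variable {cmd:w} holds positive prior weights and
{cmd:f} holds the frequencies, is to rescale the weights by their total.
{cmd:double} is used to limit rounding error in the sum. Any {cmd:if} or
{cmd:in} restriction to be given to {cmd:psbayes} should also be applied to
{cmd:summarize}, so that the prior sums to 1 over the observations actually
used.

{p 4 8 2}{cmd:. summarize w, meanonly}{p_end}
{p 4 8 2}{cmd:. generate double prior = w / r(sum)}{p_end}
{p 4 8 2}{cmd:. psbayes f prior}{p_end}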


{title:Options} 

{p 4 8 2} 
{cmd:by(}{it:rowvar} [{it:colvar} [{it:layervar}]]{cmd:)} indicates that {it:datavar}
refers to a table whose rows (and columns and layers, if specified) are
indexed by the variable(s) named. These variables structure the display of
cell estimates using {help tabdisp}. If {cmd:by()} is not specified, cell
estimates are displayed by observation number.

{p 4 8 2}
{cmd:generate(}{it:newvar}{cmd:)}
creates a new variable {it:newvar} containing the smoothed estimates:
frequencies by default, or probabilities if {cmd:prob} is also specified.

{p 4 8 2} 
{cmd:prob} indicates that probabilities rather than estimated frequencies
are to be shown (and saved, if {cmd:generate()} is also specified).

{p 4 8 2} 
{it:tabdisp_options} are options of {help tabdisp}.
The defaults are {cmd:center format(%9.1f)}.


{title:Examples}

{p 4 8 2}{cmd:. psbayes f prior, by(row col) g(sf)} 

{p 4 8 2}{cmd:. contract foreign rep78, zero nomiss}{p_end}
{p 4 8 2}{cmd:. psbayes _freq, by(foreign rep78) prob}
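
{p 4 4 2}
A further sketch, continuing from the contracted data above (the variable
names suggest the auto data, so a preceding {cmd:sysuse auto} is an
assumption): because the default display format shows one decimal place,
probabilities may be easier to read with more decimals. This assumes that a
{help tabdisp} {cmd:format()} option given by the user replaces the default.
The variable name {cmd:phat} is arbitrary.

{p 4 8 2}{cmd:. psbayes _freq, by(foreign rep78) prob format(%6.3f) g(phat)}{p_end}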


{title:Author}

{p 4 4 2}Nicholas J. Cox, University of Durham, U.K.{break} 
         n.j.cox@durham.ac.uk


{title:References} 

{p 4 8 2}
Agresti, A. 2002. {it:Categorical data analysis.} Hoboken, NJ: John Wiley.

{p 4 8 2}
Bishop, Y.M.M., Fienberg, S.E. and Holland, P.W. 1975. 
{it:Discrete multivariate analysis.} Cambridge, MA: MIT Press.

{p 4 8 2}
Fienberg, S.E. and Holland, P.W. 1970. Methods for eliminating zero
counts in contingency tables. In Patil, G.P. (ed.) 
{it:Random counts in scientific work. Volume 1: Random counts in models and structures.} 
University Park, PA: Pennsylvania State University Press, 233{c -}260.

{p 4 8 2}
Fienberg, S.E. and Holland, P.W. 1972. On the choice of flattening
constants for estimating multinomial probabilities. 
{it:Journal of Multivariate Analysis} 2: 127{c -}134.

{p 4 8 2}
Fienberg, S.E. and Holland, P.W. 1973. Simultaneous estimation of
multinomial cell probabilities. 
{it:Journal of the American Statistical Association} 68: 683{c -}691.

{p 4 8 2}
Good, I.J. 1965. 
{it:The estimation of probabilities: an essay on modern Bayesian methods.}
Cambridge, MA: MIT Press.

{p 4 8 2}
Sutherland, M., Holland, P.W. and Fienberg, S.E. 1975. Combining Bayes
and frequency approaches to estimate a multinomial parameter. In
Fienberg, S.E. and Zellner, A. (eds) 
{it:Studies in Bayesian econometrics and statistics in honor of Leonard J. Savage.}
Amsterdam: North-Holland, 585{c -}617.


{title:Also see}

{p 4 13 2} 
On-line:  help for {help tabdisp}