Binomial confidence intervals for proportions (Jeffreys prior)
cij [varlist] [weight] [if exp] [in range] [, level(#) total ]
ciji #obs #succ [, level(#) ]
by ... : may be used with cij (but not with ciji); see help by.
fweights are allowed with cij; see help weights.
cij computes standard errors and binomial confidence intervals for each variable in varlist, which should be 0/1 binomial variables. ciji is the immediate form of cij, for which you specify the number of observations and the number of successes. See help immed for more on immediate commands. With both commands, confidence intervals are calculated from the Jeffreys uninformative prior, a beta distribution with parameters 0.5 and 0.5.
Suppose we observe n events and record k successes. Here as usual "success" is conventional terminology for whatever is coded 1. For a 95% confidence interval, for example, we then take the 0.025 and 0.975 quantiles of the beta distribution with parameters k + 0.5 and n - k + 0.5. This Bayesian procedure has a frequentist interpretation as a continuity-corrected version of the so-called exact (Clopper-Pearson) confidence interval, produced by ci, binomial, which takes (in the same example) the 0.025 quantile of beta(k, n - k + 1) and the 0.975 quantile of beta(k + 1, n - k). The lower limit if all values are 0 is taken to be 0 and the upper limit if all values are 1 is taken to be 1. Among other properties, note that this interval is typically less conservative than the exact interval, so that coverage probabilities are on average close to the nominal confidence level. From a Bayesian point of view, however, the whole of the posterior distribution is much more fundamental than any interval derived from it.
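As a rough illustration of the quantile recipe just described, the limits can be reproduced by hand with invibeta(), the function named elsewhere in this help file for Stata 8. This is a sketch under stated assumptions, not the code inside cij: the scalar names nobs and nsucc are made up for the example, and the values 10 and 1 are chosen to match the ciji example.

```stata
* illustrative values: 1 success in 10 observations
scalar nobs  = 10
scalar nsucc = 1

* Jeffreys 95% limits: quantiles of beta(k + 0.5, n - k + 0.5)
display invibeta(nsucc + 0.5, nobs - nsucc + 0.5, 0.025)
display invibeta(nsucc + 0.5, nobs - nsucc + 0.5, 0.975)

* exact (Clopper-Pearson) 95% limits, for comparison
display invibeta(nsucc, nobs - nsucc + 1, 0.025)
display invibeta(nsucc + 1, nobs - nsucc, 0.975)
```

The hand calculation applies only when 0 < k < n. When all values are 0 or all are 1, the lower or upper limit is set to 0 or 1 as described above, and the exact-interval beta parameters would include a zero.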
See Brown et al. (2001) for a much fuller discussion and an entry to the literature. Brown et al. (2002) provide supporting technical background to that paper. Among many references, Agresti (2002, pp.14-21), Agresti and Coull (1998), Newcombe (1998, 2001) and Vollset (1993) provide clear and helpful context. Williams (2001, Ch.6) provides a lively alternative treatment of confidence intervals for one-parameter models. The original work on uninformative priors was by Harold Jeffreys (1946; 1948, Ch.3.9; 1961, Ch.3.10). The actuary Wilfred Perks (1947) independently produced very similar ideas. Later discussions include Good (1965, esp. pp.18-19), Rubin and Schenker (1987), Lee (1989, esp. p.93; 1997, esp. pp.88-89), Gelman et al. (1995, esp. pp.55-56) and Carlin and Louis (1996, esp. pp.50-54; 2000, esp. pp.42-46). For more on Jeffreys (1891-1989), see Cook (1990), Lindley (2001) or Lindley et al. (1991).
The method of calculating beta quantiles used here is based on the fact that if Y is distributed as beta(a,b) and X is distributed as F(2a,2b), then Y = aX / (b + aX). See (e.g.) Cramér (1946, pp.241-4) for background or Lee (1989, p.251; 1997, p.291). In Stata 8, it can be done directly with invibeta().
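The beta-F relation can be checked numerically. The sketch below assumes that invF() is available in your version of Stata; if it is not, invFtail(2*ja, 2*jb, 1 - 0.025) gives the same F quantile from the upper tail. The scalar names ja, jb and fq are invented for this illustration, and the parameter values correspond to 1 success in 10 observations.

```stata
* beta parameters for k = 1, n = 10: a = k + 0.5, b = n - k + 0.5
scalar ja = 1.5
scalar jb = 9.5

* 0.025 quantile of F(2a, 2b)
scalar fq = invF(2*ja, 2*jb, 0.025)

* transform to the beta scale: Y = aX / (b + aX)
display ja*fq / (jb + ja*fq)

* direct calculation; the two results should agree to rounding error
display invibeta(ja, jb, 0.025)
```

The transformation is monotone increasing in X, so the p quantile of F(2a,2b) maps to the p quantile of beta(a,b).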
level(#) specifies the confidence level, in percent, for confidence intervals; see help level.
total is for use with the by ... : prefix; it requests that, in addition to output for each by-group, output be added for all groups combined.
. cij foreign
. ciji 10 1          (10 binomial events, 1 observed success)
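The options can be combined with these commands as in the following sketch. It assumes that cij and ciji are installed and that the auto dataset shipped with Stata can be loaded with sysuse; the grouping variable rep78 and the levels 90 and 99 are arbitrary choices for illustration.

```stata
sysuse auto, clear

* 90% interval for the proportion of foreign cars
cij foreign, level(90)

* one interval per repair-record group, plus all groups combined
bysort rep78: cij foreign, total

* immediate form: 10 observations, 1 success, 99% interval
ciji 10 1, level(99)
```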
Nicholas J. Cox, University of Durham, U.K. email@example.com
Alan Feiveson suggested the Cramér reference. John R. Gleason increased my interest in this problem.
Agresti, A. 2002. Categorical data analysis. Hoboken, NJ: John Wiley.
Agresti, A. and Coull, B.A. 1998. Approximate is better than "exact" for interval estimation of binomial proportions. American Statistician 52: 119-126.
Brown, L.D., Cai, T.T., DasGupta, A. 2001. Interval estimation for a binomial proportion. Statistical Science 16: 101-133.
Brown, L.D., Cai, T.T., DasGupta, A. 2002. Confidence intervals for a binomial proportion and asymptotic expansions. Annals of Statistics 30: 160-201.
Carlin, B.P. and Louis, T.A. 1996/2000. Bayes and empirical Bayes methods for data analysis. Boca Raton, FL: Chapman and Hall/CRC. (1996: London: Chapman and Hall.)
Cook, A.H. 1990. Sir Harold Jeffreys. Biographical Memoirs of Fellows of the Royal Society 36: 303-333.
Cramér, H. 1946. Mathematical methods of statistics. Princeton, NJ: Princeton University Press.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B. 1995. Bayesian data analysis. London: Chapman and Hall.
Good, I.J. 1965. The estimation of probabilities: an essay on modern Bayesian methods. Cambridge, MA: MIT Press.
Jeffreys, H. 1939/1948/1961. Theory of probability. Oxford: Oxford University Press.
Jeffreys, H. 1946. An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society A 186: 453-461. Reprinted in Jeffreys, H. and Jeffreys, B.S. (eds) 1977. Collected papers of Sir Harold Jeffreys on geophysics and other sciences. Volume 6: Mathematics, probability and miscellaneous other sciences. London: Gordon and Breach, 403-411.
Lee, P.M. 1989/1997. Bayesian statistics: an introduction. London: Edward Arnold.
Lindley, D.V. 2001. Harold Jeffreys. In Heyde, C.C. and Seneta, E. (eds) Statisticians of the centuries. New York: Springer, 402-405.
Lindley, D.V., Bolt, B.A., Huzurbazar, V.S., Jeffreys, B.S., Knopoff, L. 1991. [articles on Harold Jeffreys] Chance 4(2): 10-26.
Newcombe, R.G. 1998. Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine 17: 857-872.
Newcombe, R.G. 2001. Logit confidence intervals and the inverse sinh transformation. American Statistician 55: 200-202.
Perks, W. 1947. Some observations on inverse probability including a new indifference rule. Journal, Institute of Actuaries 73: 285-334.
Rubin, D.B. and Schenker, N. 1987. Logit-based interval estimation for binomial data using the Jeffreys prior. Sociological Methodology 17: 131-144.
Vollset, S.E. 1993. Confidence intervals for a binomial proportion. Statistics in Medicine 12: 809-824.
Williams, D. 2001. Weighing the odds: a course in probability and statistics. Cambridge: Cambridge University Press.
Manual: [R] ci
On-line: help for ci, bitest, immed