```-------------------------------------------------------------------------------
help for kapssi                                       (Author:  David Harrison)
-------------------------------------------------------------------------------

Sample size calculations for kappa

Two unique raters, two ratings:

kapssi kappa, { se(#) | diff(#) [level(#)] | n(#) } p1(#) [ p2(#) round ]

Two or more (non-unique) raters, two ratings:

kapssi kappa, { se(#) | diff(#) [level(#)] | n(#) } p(#) [ m(#) round ]

Description

kapssi computes the sample size required to estimate the kappa statistic
of inter-rater reliability for a binary outcome (having postulated value
kappa) with a given standard error, or the standard error obtained with a
given sample size.  If n() is specified, kapssi computes the standard
error; otherwise it computes the sample size.  kapssi is an immediate
command; all of its arguments are numbers (see help immed).

For two raters, the results are the same as those produced by sskdlg or
sskapp (except for rounding; see round option below), based on the
asymptotic variance presented by Fleiss, Cohen and Everitt (1969).
Results for more than two raters are based on the asymptotic variance for
the Fleiss-Cuzick estimator of kappa presented by Zou and Donner (2004) in
the case of equal numbers of ratings for each subject.

Options

se(#) specifies the standard error of kappa.

diff(#) specifies the half width of the confidence interval for kappa as
an alternative to the standard error.

level(#) specifies the confidence level, as a percentage, for the
confidence interval whose half width is given in diff(); the default is
obtained from set level (see help level), usually level(95).

n(#) specifies the sample size for which to calculate standard error.

p1(#) specifies the proportion of positive results reported by rater 1
(of two raters).

p2(#) specifies the proportion of positive results reported by rater 2
(of two raters); if p2 is not specified it is assumed to be equal to
p1.

p(#) specifies the overall proportion of positive results (multiple
raters).

m(#) specifies the number of raters; the default is m(2).

round specifies that the sample size is to be rounded to the nearest
integer; the default is to round up using the function ceil().
Specifying round allows reproducibility of the two-rater results
produced by sskdlg or sskapp, both of which round to the nearest integer.

Examples

Two raters.  Compute sample size given standard error:

. kapssi .8, se(.1) p(.1)

Compute sample size given half width of confidence interval:

. kapssi .6, diff(.2) p1(.15) p2(.12) round

This is equivalent to:

. sskapp, p1(.15) p2(.12) diff(.2) kapp(.6)

More than two raters.  Compute sample size:

. kapssi .75, se(.12) p(.05) m(3)

Compute standard error for given sample size:

. kapssi .8, n(100) p(.12) m(4)

References

Fleiss, J. L., Cohen, J. and Everitt, B. S. 1969. Large sample standard
errors of kappa and weighted kappa. Psychological Bulletin 72: 323-327.

Zou, G. and Donner, A. 2004. Confidence interval estimation of the
intraclass correlation coefficient for binary outcome data. Biometrics
60: 807-811.

Maintainer

David A. Harrison
Intensive Care National Audit & Research Centre
david@icnarc.org

Also see

Online:  help for kappa, sskdlg, sskapp, immed
```
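
The relationship between the diff(), level(), se(), n() and round options described above can be illustrated with a short sketch. This is not kapssi's source code. It assumes a normal-approximation confidence interval (so the half width equals a normal quantile times the standard error) and an asymptotic variance that shrinks in proportion to 1/n. The function names and the `unit_variance` argument are hypothetical: `unit_variance` stands for the per-subject asymptotic variance of kappa, which the help file attributes to Fleiss, Cohen and Everitt (1969) or Zou and Donner (2004) and which this sketch does not derive.

```python
import math
from statistics import NormalDist


def se_from_half_width(diff, level=95.0):
    """Standard error implied by a confidence-interval half width.

    Assumes a normal-approximation interval: diff = z * se, where z is the
    standard normal quantile for the given confidence level (in percent).
    """
    z = NormalDist().inv_cdf(1 - (1 - level / 100) / 2)
    return diff / z


def sample_size(unit_variance, se, nearest=False):
    """Number of subjects needed to reach the target standard error.

    unit_variance is a hypothetical input (per-subject asymptotic variance),
    so that se**2 = unit_variance / n is assumed to hold.
    """
    n_exact = unit_variance / se ** 2
    if nearest:
        # Round to the nearest integer, in the spirit of the round option
        # (halves are rounded up here; that detail is an assumption).
        return math.floor(n_exact + 0.5)
    # Default behaviour described in the help file: round up.
    return math.ceil(n_exact)


def standard_error(unit_variance, n):
    """Standard error for a given sample size, the inverse calculation."""
    return math.sqrt(unit_variance / n)


if __name__ == "__main__":
    se = se_from_half_width(0.2, level=95)
    n_up = sample_size(1.234, se)
    n_near = sample_size(1.234, se, nearest=True)
    print(se, n_up, n_near, standard_error(1.234, n_up))
```

Rounding up guarantees that the achieved standard error does not exceed the target, whereas rounding to the nearest integer can leave it slightly above the target. That is one reason the two conventions can differ by one subject.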