------------------------------------------------------------------------------- help for circtwosample -------------------------------------------------------------------------------

Two-sample tests for circular data

circtwosample varlist [if exp] [in range] , by(byvar)

Description

circtwosample carries out the two-sample tests of Watson (1962) and Kuiper (1960) for circular variables in varlist measured on a scale from 0 to 360 degrees. The values of each variable are split into two groups according to byvar, which must take exactly two distinct non-missing values, subject to any restrictions imposed by if or in. The null hypothesis is that the two distributions so defined are identical. Both test statistics are based on the empirical distribution functions of the two samples.
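
As a rough illustration of how statistics of this kind are built from two empirical distribution functions, here is a minimal Python sketch. It is not the code used by circtwosample: it assumes the usual textbook forms of the two-sample Watson U-square and Kuiper V statistics, and the function name two_sample_edf_stats is hypothetical.

```python
from bisect import bisect_right


def two_sample_edf_stats(sample1, sample2):
    """Return (U2, V) for two samples of angles in degrees.

    Assumed textbook forms (not taken from circtwosample itself):
      U2 = n1*n2/N^2 * (sum(d^2) - sum(d)^2 / N)
      V  = max(d) - min(d)
    where d = F1 - F2 is evaluated at each of the N pooled observations.
    """
    # Reduce angles to [0, 360) and sort each sample.
    a = sorted(x % 360 for x in sample1)
    b = sorted(x % 360 for x in sample2)
    n1, n2 = len(a), len(b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2

    # Difference of right-continuous EDFs at every pooled observation.
    # Tied values are counted once per occurrence.
    d = [bisect_right(a, x) / n1 - bisect_right(b, x) / n2 for x in a + b]

    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    u2 = n1 * n2 / n**2 * (sum_d2 - sum_d**2 / n)

    # d is 0 at the largest pooled value, so max(d) >= 0 >= min(d).
    v = max(d) - min(d)
    return u2, v


if __name__ == "__main__":
    u2, v = two_sample_edf_stats([10, 20, 30, 40, 50], [110, 120, 130, 140, 150])
    print(round(u2, 3), round(v, 3))
```

For two completely separated samples of five angles each, this sketch gives U-square = 0.225, which agrees with the largest entry in the n_1 = n_2 = 5 row of the table under Remarks.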

Allowing a varlist is a convenience that permits many tests from a single command. The tests are separate, and users searching for significant results are urged to consider carefully what they are doing.

Remarks

For the Watson U-square statistic, critical values are given here for selected sample sizes n_1 and n_2 and significance levels P. For many problems with moderate or large samples, the values for infinite sample sizes will serve as adequate approximations.

    n_1   n_2 P = 0.5     0.2     0.1    0.05    0.01   0.005   0.001
      5     5   0.089   0.161   0.225   0.225
      7     7   0.079   0.135   0.158   0.199   0.304   0.304
      9     9   0.077   0.125   0.155   0.187   0.266   0.286   0.384
     12    12   0.075   0.122   0.153   0.186   0.256   0.284   0.344
     20    20   0.069   0.117   0.151   0.185   0.261   0.293   0.367
     40    40   0.069   0.117   0.152   0.186   0.264   0.298   0.374
    100   100   0.069   0.117   0.152   0.187   0.267   0.300   0.378
     infinite   0.071   0.117   0.152   0.187   0.268   0.304   0.385

Fuller tables can be found in Mardia (1972, p.314), Batschelet (1981, p.348), Kanji (1999, p.210), Zar (1999, Table B.38) and Mardia and Jupp (2000, p.377); that of Zar is the most extensive of these.

For the Kuiper statistics, tables can be found in Batschelet (1981, pp.341, 346-7) and in Upton and Fingleton (1989, pp.393, 395). Provided that at least one sample size exceeds 12, k* may be compared with the following critical values (see the discussion in Upton and Fingleton, 1989, p.279):

    P =     0.1    0.05    0.01   0.005   0.001
           1.62    1.75    2.00    2.10    2.30
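
The large-sample comparisons described in these Remarks can be sketched as a simple table lookup. The following Python sketch is only illustrative: the critical values are transcribed from the infinite-size row of the Watson table and from the k* values above, the function names are hypothetical, and treating a statistic equal to a critical value as significant at that level is an assumption.

```python
# Critical values transcribed from the two tables in these Remarks.
# Keys are significance levels P; values are critical values.
WATSON_U2_INFINITE = {
    0.5: 0.071, 0.2: 0.117, 0.1: 0.152, 0.05: 0.187,
    0.01: 0.268, 0.005: 0.304, 0.001: 0.385,
}
KUIPER_KSTAR = {0.1: 1.62, 0.05: 1.75, 0.01: 2.00, 0.005: 2.10, 0.001: 2.30}


def smallest_level_reached(statistic, critical_values):
    """Return the smallest tabulated P whose critical value is reached.

    Returns None if the statistic is below every tabulated critical value.
    Assumption: statistic >= critical value counts as reaching that level.
    """
    reached = [p for p, crit in critical_values.items() if statistic >= crit]
    return min(reached) if reached else None


def kuiper_level(k_star, n1, n2):
    """Look up k* only when at least one sample size exceeds 12."""
    if max(n1, n2) <= 12:
        raise ValueError("both samples are 12 or smaller; use published tables")
    return smallest_level_reached(k_star, KUIPER_KSTAR)


if __name__ == "__main__":
    print(smallest_level_reached(0.20, WATSON_U2_INFINITE))
    print(kuiper_level(1.80, 20, 15))
```

A result of None means only that the statistic falls short of every tabulated value. For small samples, the fuller published tables cited in these Remarks remain the appropriate reference.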

Options

by(byvar) specifies the grouping variable, which must take exactly two distinct non-missing values. It is a required option.

Example

. circtwosample dir, by(group)

References

Batschelet, E. 1981. Circular statistics in biology. London: Academic Press. (Edward Batschelet, 1914-1979)

Kanji, G.K. 1999. 100 statistical tests. London: Sage.

Kuiper, N.H. 1960. Tests concerning random points on a circle. Proceedings, Koninklijke Nederlandse Akademie van Wetenschappen Series A 63: 38-47. (Nicolaas Hendrik Kuiper, 1920-1994)

Mardia, K.V. 1972. Statistics of directional data. London: Academic Press.

Mardia, K.V. and Jupp, P.E. 2000. Directional statistics. Chichester: John Wiley.

Upton, G.J.G. and Fingleton, B. 1989. Spatial data analysis by example. Volume 2: Categorical and directional data. Chichester: John Wiley.

Watson, G.S. 1962. Goodness-of-fit tests on a circle. II. Biometrika 49: 57-63. (Geoffrey Stuart Watson, 1921-1998)

Zar, J.H. 1999. Biostatistical analysis. Upper Saddle River, NJ: Prentice-Hall.

Author

Nicholas J. Cox, University of Durham, U.K. n.j.cox@durham.ac.uk

Also see