```-------------------------------------------------------------------------------
help for circular statistics commands
-------------------------------------------------------------------------------

Introduction

Circular data are a large class of directional data, which are of
interest to scientists in many fields, including biologists (movements of
migrating animals), meteorologists (winds), geologists (directions of
joints and faults) and geomorphologists (landforms, oriented stones).
Such examples are all recordable as compass bearings relative to North.
Other examples include phenomena that are periodic in time, including
those dependent on time of day (of interest to biomedical statisticians:
hospital visits, times of birth, etc.) or time of year (of interest to
applied economists: unemployment or sales variations).

The elementary but also fundamental property of circular data is that the
beginning and end of the scale coincide: for example, 0 degrees = 360
degrees.  An immediate implication is that the arithmetic mean is likely
to be a poor summary: the mean of 1° and 359° cannot sensibly be 180°.
The solution is to use the vector mean direction as the circular mean. More
generally, the different outcome space means that many standard methods
designed for variables measured on the line are of little or no use with
circular variables.

The programs so far assume that data are recorded in degrees. Users
working with other scales (e.g. time of day on a 24 hour clock, day or
month of year) could write their own trivial preprocessor and fix
cosmetic details such as graph axis labels. In due course I may
implement user setting of different scales, possibly through
characteristics modified by some circset command. Stata's trigonometric
functions expect angles in radians. The factors _pi / 180 and 180 / _pi
are thus useful for conversion between degrees and radians.

In addition, the compass or clock convention of measurement clockwise
from a vertical axis (e.g. North) is used throughout for circular graphs,
not the mathematical convention of measuring angles counterclockwise
(anticlockwise) from a horizontal axis.

The degree symbol may be invoked (e.g. in text for graphs) as "{c 176}".
If that fails, try "`=char(176)'". To see such symbols in various Stata
windows, you may need to change the font.

Utilities

circcentre rotates a set of directions to a new centre: the result is
between -180° and 180°.

circdiff measures difference between circular variables or constants as
the shorter arc around the circle.

fourier generates pairs of sine and cosine variables sin j theta, cos j
theta for j = 1, ..., k.

atan2() is an arctangent egen function giving results on the whole
circle.

Summary statistics and significance tests

circsummarize is a basic workhorse that calculates the vector mean, the
vector strength and the circular range. It offers, as options,
approximate confidence intervals for the vector mean and Rayleigh and
Kuiper tests of uniform distribution on the circle.  (The abbreviation
circsu is allowed.)

circrao carries out a uniformity test suggested by J.S. Rao.  One merit
of this test is that it works well for data which are not unimodal.

If circular data arrive coarsely grouped (e.g. 4 or 8 points of the
compass), then a chi-square test as applied by chitest or chitesti is a
possible alternative test of uniformity. If data are measured more
precisely, then it is arguable that the chi-square test is a poor choice
compared with the alternatives: not only does it require arbitrary
decisions on bin width and origin, it takes no account of the circular
nature of the data.

circmedian calculates the circular median and mean deviation from the
median.

circovmean and circovstr show the effects of omitting individual values
on the vector mean and the vector strength.

circtwosample and circwwmardia offer nonparametric tests for comparing
two or more subsets of directions. circtwosample offers two test
statistics based on empirical distribution functions to test whether two
distributions are identical, namely Watson's U² and Kuiper's k*.
circwwmardia carries out a homogeneity test due to Wheeler and Watson and
to Mardia given subdivision into two or more groups.

Univariate and bivariate graphics

circrplot loosely resembles spikeplot; circdplot loosely resembles
dotplot.  circvplot shows the ordered directions added end to end with
the vector mean as resultant.  Many users like such intrinsically
circular representations, but note that it may be necessary to use graph
display, typically with equal xsize() and ysize(), to fix the aspect
ratio.

Another approach is to wrap around the scale, showing up to two full
cycles on a linear graph. circhistogram is a wrapper for histogram,
adding a pad of values (default 180°) to both extremes. (The abbreviation
circhist is allowed.) circscatter is a wrapper for scatter that adds a
pad to both extremes on either or both of x and y axes.  (The
abbreviation circsc is allowed.)

circqqplot is a quantile-quantile plot for two circular variables. It is
a wrapper for qqplot. Data are rotated so that each variable is centred
on a specified value.

Note that a quantile plot of directions can be useful: quantile - or
alternatively, qplot (see Stata Journal 4(1): 97, 2004) - is already
available for this purpose.

Smoothing, relationships and modelling

circkdensity drives a nonparametric density estimation routine with
a biweight kernel. Despite the name, it does not call kdensity.

For exploratory smoothing, circylowess is for circular response and
non-circular covariate and circxlowess is for non-circular response and
circular covariate. Both are wrappers for lowess.  With circylowess, the
recipe is to smooth sine and cosine components and to recombine using
arctangent: smooth of theta = arctan(smooth of sin theta, smooth of cos
theta).  With circxlowess, the recipe is to smooth around the circle by

circlccorr and circcorr implement correlation methods for cases where one
or both variables are circular.

Note that regression of a non-circular response on various terms of a
Fourier series requires nothing extra in Stata beyond regress and other
basic modelling commands (although fourier can help in producing
covariates).  It is often extremely useful, and can be extended to
include non-circular covariates.

von Mises distributions

circvm fits a von Mises distribution, the most important unimodal
reference distribution on the circle, using an approximate maximum
likelihood method. (Doing it properly with ml is on the agenda.)

circqvm shows a quantile-quantile plot for data versus a fitted von Mises
distribution. Data are rotated so that the mean is at the centre of the
plot.

circpvm shows a probability plot (P-P plot) for data versus a fitted von
Mises distribution.

circdpvm shows a density probability plot for data versus a fitted von
Mises distribution.

egen functions invvm(), rndvm(), vm() and vmden() and a calculator
function i0kappa are supporting utilities, occasionally used directly.

Author

Nicholas J. Cox, University of Durham, U.K.
n.j.cox@durham.ac.uk

Acknowledgements

Ian S. Evans has kindly provided me with information and requests on
circular statistics over more than thirty years.

```
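
The help text argues that the arithmetic mean fails for directions and that the vector mean should be used instead, and it describes circular difference as the shorter arc around the circle. The following is a minimal sketch of both ideas in Python, not the code used by circsummarize or circdiff. The function names `vector_mean` and `shorter_arc` are hypothetical, and returning an unsigned difference is an assumption made here for simplicity.

```python
import math

def vector_mean(degrees):
    """Return (mean direction in degrees, reduced modulo 360; vector strength).

    Each direction is treated as a unit vector. The mean direction is the
    angle of the average vector and the strength is its length (0 to 1).
    """
    n = len(degrees)
    # math.sin and math.cos expect radians, so convert from degrees first.
    s = sum(math.sin(math.radians(d)) for d in degrees) / n
    c = sum(math.cos(math.radians(d)) for d in degrees) / n
    # Two-argument arctangent gives a result on the whole circle.
    mean = math.degrees(math.atan2(s, c)) % 360
    strength = math.hypot(s, c)
    return mean, strength

def shorter_arc(a, b):
    """Unsigned difference between two directions, as the shorter arc."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

if __name__ == "__main__":
    # The arithmetic mean of 1 and 359 is 180; the vector mean is within
    # rounding error of North (0 degrees, equivalently 360 degrees).
    print(vector_mean([1, 359]))
    print(shorter_arc(350, 10))
```

Because floating-point rounding can place a mean that is essentially North on either side of 0, it is safer to compare directions with something like `shorter_arc` than to test for exact equality.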