help skilmack
-------------------------------------------------------------------------------

Title

skilmack -- Skillings-Mack test

Syntax

skilmack varname [if] [in], id(varname) repeated(varname) [options]

    options                         Description
    -------------------------------------------------------------------------
    covariance                      use estimated covariance matrix in place
                                      of no-ties covariance matrix
    forcesims(on|off)               if and only if there are ties will
                                      simulations be run, unless overridden
                                      by this option
    reps(#)                         number of simulations; default is
                                      reps(1000)
    seed(#)                         specify initial value of random-number
                                      seed
    notable(noties|tiescov|both)    suppress output table produced in
                                      no-ties section or in case of ties
                                      when covariance option is used (or
                                      both)
    -------------------------------------------------------------------------

Description

skilmack implements the Skillings-Mack (SM) test, which is a generalization of the Friedman test. It is particularly useful for an unbalanced/incomplete block design or in the presence of missing data. Missing data can be missing by design or missing completely at random.

N.B. The SM test is equivalent to the Friedman test when there are no missing data and is useful when there are many ties or equal ranks.

Unlike with the friedman command, the data are required to be in the more usual long format, i.e., one column for the outcome measure, one for the block identifier or ID, and one for the treatment or within-block repeated variable.

Options

id(varname) is required and specifies the factor variable containing the block identifiers.

repeated(varname) is required and specifies the factor variable containing the treatment identifiers.

covariance specifies that the estimated covariance matrix is used in place of the no-ties covariance matrix. The estimated covariance matrix is the sample covariance matrix of the weighted sum of centered ranks from the simulations.

forcesims(on|off) forces simulations to be run (on) or not to be run (off). By default, simulations are run if and only if there are ties.

reps(#) sets the number of simulations. The default is reps(1000).

seed(#) specifies the random-number seed; by default, the seed is taken from the current time. Specifying a seed allows an exact replication of the Monte Carlo simulations.

notable(noties|tiescov|both) suppresses the output table produced in the no-ties section or in the case of ties when the covariance option is used (or both).

Remarks

The following data are taken from Brady (1969):

    Dysfluencies under each condition

         +-----------------+
         | id   R   A    N |
         |-----------------|
         |  1   3   5   15 |
         |  2   1   3   18 |
         |  3   5   4   21 |
         |  4   2   .    6 |
         |  5   0   2   17 |
         |  6   0   2   10 |
         |  7   0   3    8 |
         |  8   0   2   13 |
         +-----------------+

Use reshape to reshape the data into the long format; that is, after prefixing each condition name with "score" by using rename, type

    . reshape long score, i(id) j(cond) string

         +--------------------+
         | id   cond    score |
         |--------------------|
         |  1      A        5 |
         |  1      N       15 |
         |  1      R        3 |
         |--------------------|
         |  2      A        3 |
         |  2      N       18 |
         |  2      R        1 |
         |--------------------|
         |  3      A        4 |
         |  3      N       21 |
         |  3      R        5 |
         |--------------------|
         |  4      A        . |
         |  4      N        6 |
         |  4      R        2 |
         |--------------------|
         |  5      A        2 |
         |  5      N       17 |
         |  5      R        0 |
         |--------------------|
         |  6      A        2 |
         |  6      N       10 |
         |  6      R        0 |
         |--------------------|
         |  7      A        3 |
         |  7      N        8 |
         |  7      R        0 |
         |--------------------|
         |  8      A        2 |
         |  8      N       13 |
         |  8      R        0 |
         +--------------------+
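For reference, the whole preparation step can be sketched as a short do-file fragment. It assumes the wide data hold variables named id, R, A, and N, as in the first table; the final list command is only there to display the result.

```stata
* Enter the wide-format data (one row per block, one column per condition).
clear
input id R A N
1 3 5 15
2 1 3 18
3 5 4 21
4 2 . 6
5 0 2 17
6 0 2 10
7 0 3 8
8 0 2 13
end

* Prefix each condition with "score" so that reshape can find the stub.
rename (R A N) (scoreR scoreA scoreN)

* One row per block-condition pair; cond holds the string suffix R, A, or N.
reshape long score, i(id) j(cond) string
list, sepby(id)
```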

The SM results from the above data are

. skilmack score, id(id) repeated(cond)

Weighted Sum of Centered Ranks

      cond |   N   WSumCRank      SE   WSum/SE
    -------+-------------------------------------
         A |   7       -1.73    3.74     -0.46
         N |   8       13.12    3.87      3.39
         R |   8      -11.39    3.87     -2.94
    ---------------------------------------------
     Total               0

    Skillings Mack = 13.281
    P-value (No ties) = 0.0013

    N.B. As P-value <0.02, it is likely to be conservative (unless n large).
    Consider obtaining a p-value from a simulated null distribution of SM - see options.

N.B. A large negative WSumCRank (or WSum/SE) means a low ranking (e.g., 1) because of typically low scores. This was the case for condition R, which had the fewest dysfluencies.

Simulations and ties

Simulations are preferable for obtaining more accurate small p-values if the sample size is not large, because the p-value from the chi-squared approximation is likely to be conservative here (Skillings and Mack 1981).
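The example data in Remarks have no within-block ties, so simulations are not run by default and a simulated p-value has to be requested explicitly. The following is a minimal sketch that uses only the options documented above; the number of replications and the seed value are arbitrary choices for illustration.

```stata
* Request a simulated null distribution even though there are no ties.
* reps(5000) and seed(12345) are arbitrary illustrative values.
skilmack score, id(id) repeated(cond) forcesims(on) reps(5000) seed(12345)
```

Fixing the seed makes the simulated p-value reproducible from run to run.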

If there are ties (equal ranks), average ranks are assigned, e.g., 1.5, 1.5, 3. Assigning average ranks is perhaps the most common way of dealing with ties. However, one may prefer to force ranks to be randomly assigned when they are tied. (This can effectively be done by adding a small random amount to each score.)
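The random tie-breaking described above could be sketched as follows. The variable name score_jit and the jitter size 1e-6 are assumptions made for illustration; the jitter only needs to be smaller than the smallest gap between distinct scores so that it never reorders untied values.

```stata
set seed 2468                    // arbitrary seed, for reproducibility

* Add a tiny random amount to each observed score; missing scores stay missing.
generate double score_jit = score + 1e-6*runiform() if !missing(score)

* Rank the jittered scores instead of the original ones.
skilmack score_jit, id(id) repeated(cond)
```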

The SM statistic can be calculated when there are ties; however, the p-value calculated from the assumed chi-squared null distribution becomes more and more conservative the more ties there are. To provide a more accurate p-value, simulations are used to approximate the distribution of SM values under the null hypothesis, and conditional on the particular missing-data structure and tied rankings.

A dataset is simulated by sorting on random numbers within each individual, which randomly shuffles which data point belongs to which repeat. Missing data points are left where they are, and only the observed values are shuffled, so the missing-data structure is preserved.
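The shuffling step can be illustrated with a few lines of Stata. This is only a sketch of the idea applied to long-format data such as the example in Remarks, not the code that skilmack itself runs; the variable names miss, slot, u, and simscore are hypothetical.

```stata
set seed 13579                           // arbitrary seed
generate byte miss = missing(score)      // 1 for missing cells, 0 otherwise

* Number the observed cells of each block in treatment order.
bysort id miss (cond): generate long slot = _n

* Draw one random number per cell; sorting on it gives a random order.
generate double u = runiform()

* Under by, score[slot] is the slot-th value of the group in the random
* order, so the observed cells of a block receive a random permutation of
* that block's observed scores. Missing cells form their own group and
* stay missing.
bysort id miss (u): generate double simscore = score[slot]
sort id cond
```

Repeating this many times and recomputing SM on simscore each time would give an approximate null distribution conditional on the observed missing-data pattern.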

With the covariance option, the SM statistic is redefined: the covariance matrix of the weighted sums of centered ranks is estimated from the simulations and used in place of the no-ties covariance matrix (which is accurate when there are no ties, but not when there are many ties). A new table is produced with different standard errors, and a new SM statistic and p-value are calculated. The tables can be suppressed by using the notable() option.
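A call that combines these options might look like the sketch below. Whether covariance needs forcesims(on) when the data contain no ties is not stated here; it is included as an assumption, because the estimated covariance matrix is computed from the simulations. The seed value is arbitrary.

```stata
* Use the simulation-based covariance matrix and hide the no-ties table.
* forcesims(on) is an assumption for data without ties; seed(12345) is arbitrary.
skilmack score, id(id) repeated(cond) forcesims(on) covariance ///
    notable(noties) seed(12345)
```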

Definition of SM statistic

SM = A' (sigma0^-1) A, where A is a column vector of all but one of the weighted sums of centered ranks [each centered rank is weighted by sqrt(12/(s_i + 1)), where s_i is the number of measures for person i], and (sigma0^-1) is any generalized inverse of the covariance matrix.
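The weighted sums of centered ranks that make up A can be checked by hand. The sketch below assumes the long-format example data from Remarks are in memory and that a missing cell contributes zero to the sum; the variable names rnk, s, and wcr are hypothetical.

```stata
* Within-block ranks; tied scores get average ranks, missing scores get none.
egen double rnk = rank(score), by(id)

* s = number of observed measures in each block.
egen s = count(score), by(id)

* Centered rank (rank minus the block's mean rank), weighted by sqrt(12/(s+1)).
generate double wcr = sqrt(12/(s + 1)) * (rnk - (s + 1)/2)
replace wcr = 0 if missing(score)

* Sum over blocks for each treatment.
tabstat wcr, by(cond) statistics(sum)
```

On the example data, the three sums should agree with the WSumCRank column of the output table in Remarks (-1.73, 13.12, and -11.39 to two decimals).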

References

Brady, J. P. 1969. Studies on the metronome effect on stuttering. Behaviour Research and Therapy 7: 197-204.

Skillings, J. H., and G. A. Mack. 1981. On the use of a Friedman-type statistic in balanced and unbalanced block designs. Technometrics 23: 171-177.

Authors

    Mark Chatfield
    Medical Research Council
    Human Nutrition Research
    Cambridge, UK
    mdc_england@hotmail.com

    Adrian Mander
    Medical Research Council
    Human Nutrition Research
    Cambridge, UK

Also see

Online: friedman (if installed)