-------------------------------------------------------------------------------
help for bpmedian and bpdifmed                                   (Roger Newson)
-------------------------------------------------------------------------------

Bonett-Price confidence intervals for medians and their contrasts

bpmedian varname [if] [in] [ , level(#) eform fast ]

bpdifmed varname [if] [in] , by(groupvarname) [ level(#) eform fast ]

where groupvarname is the name of a grouping variable, which must have exactly two distinct non-missing values.

by is allowed; see [R] by.

Description

bpmedian calculates a Bonett-Price confidence interval for a median, using the Bonett-Price standard error, and saves the results as estimation results. These can then be saved in an output dataset (or resultsset), using the parmest package (downloadable from SSC), and then input to the metaparm module of the parmest package to calculate Bonett-Price confidence intervals for a linear contrast between medians of independent groups. bpdifmed calculates Bonett-Price confidence intervals for the medians of two groups, defined by a grouping variable, and also for their difference or ratio.

Options for bpmedian and bpdifmed

level specifies the confidence level to be used for calculating the confidence intervals.

eform specifies that confidence intervals will be calculated for the exponentiated median(s), and also, in the case of bpdifmed, for the ratio between the first exponentiated median and the second exponentiated median. If eform is not specified, then confidence intervals are calculated for the unexponentiated median(s), and also, in the case of bpdifmed, for the difference between the first median and the second median.

The eform option is useful if the input varname contains the logarithms of a primary variable. For a continuous positive random variable, the ratio between two exponentiated subpopulation medians of the logged variable is the ratio between the two corresponding subpopulation medians of the unlogged variable.

Note that, for a real-life variable (which is never perfectly continuous in a finite sample), the median estimate produced using the logged variable with eform may differ from the median estimate produced using the unlogged variable without eform. This is because the unlogged variable may have two mid-range values (as happens when the number of observations is even). In this case, the median estimate produced using the unlogged variable without eform is the arithmetic mean of the two mid-range values, whereas the median estimate produced using the logged variable with eform is their geometric mean, which is lower than their arithmetic mean whenever the two values differ.
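As a small numerical illustration of the arithmetic-versus-geometric-mean point, the following sketch uses two made-up mid-range values, 3 and 5. These numbers are hypothetical and are not taken from any dataset.

```stata
* Hypothetical mid-range values 3 and 5 (illustrative only)
* Arithmetic mean: the median estimate from the unlogged variable without eform
display (3 + 5)/2
* Geometric mean: the exponentiated median of the logged variable, as with eform
display exp((ln(3) + ln(5))/2)
```

The first expression evaluates to 4, and the second to the square root of 15, which is a little under 3.9.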

fast is an option for programmers. It specifies that bpmedian and bpdifmed will take no action to restore the original data if the program fails, or if the user presses Break.

Options for bpdifmed only

by(groupvarname) specifies a grouping variable, which must have exactly 2 non-missing values. bpdifmed will estimate the difference between the medians, or the ratio between the exponentiated medians, of the dependent variable varname in the two groups.

Use of predict after bpmedian

If predict is used after bpmedian, then the predicted values calculated (using predict with no options or with the xb option) will be equal to the estimated median, and the standard errors calculated (using predict with the stdp option) will be equal to the standard error of the estimated median. The score option of predict is not allowed after bpmedian.
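A minimal sketch of this behaviour, assuming that bpmedian is installed. The new variable names medhat and medse are hypothetical, chosen only for illustration.

```stata
* Assumes bpmedian is installed; medhat and medse are hypothetical names
sysuse auto, clear
bpmedian weight
* Predicted values: equal to the estimated median in every observation
predict medhat, xb
* Prediction standard errors: equal to the standard error of the median
predict medse, stdp
* Both new variables should be constant across observations
summarize medhat medse
```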

Remarks

The Bonett-Price variance estimator for the sample median is introduced in Price and Bonett (2001). The theory behind Bonett-Price confidence intervals for general contrasts of independent sample medians is introduced in Bonett and Price (2002). The special case of confidence intervals for the difference or ratio between the medians of two independent groups is discussed in Price and Bonett (2002). The formulas for these confidence intervals are related to the confidence interval formulas used by centile, mean and ttest, but are not the same as the formulas used by any of those commands.

Note that the difference (or ratio) between medians is not the same parameter as the Hodges-Lehmann median pairwise difference (or ratio) between values of a variable in two groups, which is estimated by the cendif module of the somersd package, downloadable from SSC. The two population parameters are the same if either the two subpopulation distributions are symmetrical or the two subpopulation distributions differ only in location. The methods of bpdifmed and cendif still produce consistent estimates if neither of these assumptions is true. However, under those circumstances, the two methods are estimating different parameters, and are not alternative methods for estimating the same parameter.

Examples

    . sysuse auto, clear
    . bpmedian weight
    . bpdifmed weight, by(foreign)

    . sysuse auto, clear
    . gene logweight=log(weight)
    . bpmedian logweight, eform
    . bpdifmed logweight, eform by(foreign)

The following example demonstrates the use of bpmedian with the parmby and metaparm modules of the parmest package, downloadable from SSC. We first estimate medians for length (in inches) in four groups: even-numbered US cars, odd-numbered US cars, even-numbered non-US cars, and odd-numbered non-US cars, where a car's number is its observation number in the dataset. These medians, with their confidence limits and P-values, are stored in an output dataset (or resultsset), with one observation per car group, which replaces the original input dataset in memory. The new dataset is listed. We then use metaparm to estimate the difference, between odd-numbered and even-numbered cars, in the difference between non-US and US median lengths. This difference between differences (or interaction) is listed, and not saved.

    . sysuse auto, clear
    . gene byte odd=mod(_n,2)
    . parmby "bpmedian length", by(foreign odd) norestore
    . list
    . metaparm [iweight=((odd==1)-(odd==0))*((foreign==1)-(foreign==0))], list(,)

Saved results

bpmedian saves the following in e():

Scalars
    e(N)            number of observations
    e(c)            rank of original upper confidence limit

Macros
    e(cmd)          bpmedian
    e(cmdline)      command as typed
    e(depvar)       name of dependent variable
    e(properties)   b V

Matrices
    e(b)            coefficient vector
    e(V)            variance-covariance matrix of the estimators

Functions
    e(sample)       marks estimation sample

The scalar e(c) contains the rank, in the outcome-sorted sample, of the original upper confidence limit, denoted as c in the equations of Price and Bonett (2001). The Bonett-Price standard error is an example of a standard error calculated by the inverse confidence interval method. This method starts from an original confidence interval, defined without using a standard error, which here extends from the (N-c+1)th order statistic to the cth order statistic. The invcise package, downloadable from SSC, also computes standard errors for sample statistics using the inverse confidence interval method.
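The saved results can be inspected after estimation, as in the following sketch. It assumes that bpmedian is installed and that the command is run on all observations of a variable with no missing values, so that sorting the dataset gives the order statistics of the estimation sample. The matrix name V and the local macro names lo and hi are hypothetical.

```stata
sysuse auto, clear
bpmedian weight
* Stored scalars: sample size and rank of the original upper confidence limit
display "N = " e(N) "   c = " e(c)
* Median estimate and its variance
matrix list e(b)
matrix list e(V)
* Bonett-Price standard error as the square root of the variance
matrix V = e(V)
display "SE = " sqrt(V[1,1])
* Original interval: from the (N-c+1)th to the cth order statistic
local lo = e(N) - e(c) + 1
local hi = e(c)
sort weight
display "Original interval: " weight[`lo'] " to " weight[`hi']
```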

bpdifmed saves the following in r():

Scalars
    r(N)            number of observations
    r(N_1)          first sample size
    r(N_2)          second sample size
    r(level)        confidence level

Macros
    r(depvar)       name of dependent variable
    r(by)           name of by() variable defining groups
    r(eform)        eform if specified

Matrices
    r(cimat)        matrix of sample numbers, confidence intervals and P-values

The matrix r(cimat) is displayed as output by bpdifmed. It has 5 columns, containing sample numbers, estimates, lower and upper confidence limits, and P-values, respectively. It has 3 rows, containing this information on the first sample median, the second sample median, and the difference (or ratio) between medians, respectively.
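Individual entries of r(cimat) can be extracted by row and column position, as in the following sketch. It assumes that bpdifmed is installed; the matrix name C is hypothetical.

```stata
sysuse auto, clear
bpdifmed weight, by(foreign)
* Copy r(cimat) before another command overwrites r()
matrix C = r(cimat)
matrix list C
* Row 3 holds the difference between medians;
* columns 2 to 5 hold the estimate, lower limit, upper limit and P-value
display "Difference: " C[3,2] "  CI: " C[3,3] " to " C[3,4] "  P = " C[3,5]
```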

Author

Roger Newson, National Heart and Lung Institute, Imperial College London, UK. Email: r.newson@imperial.ac.uk

References

Bonett, D. G. and Price, R. M. 2002. Statistical inference for a linear function of medians: Confidence intervals, hypothesis testing, and sample size requirements. Psychological Methods 7(3): 370-383.

Price, R. M. and Bonett, D. G. 2002. Distribution-free confidence intervals for difference and ratio of medians. Journal of Statistical Computation and Simulation 72(2): 119-124.

Price, R. M. and Bonett, D. G. 2001. Estimating the variance of the sample median. Journal of Statistical Computation and Simulation 68(3): 295-305.

Also see

Manual:   [R] centile, [R] mean, [R] ttest, [R] predict
On-line:  help for centile, mean, ttest, predict
          help for parmest, parmby, parmcip, metaparm, somersd, cendif, censlope, invcise if installed