-------------------------------------------------------------------------------
help for invcise                                                 (Roger Newson)
-------------------------------------------------------------------------------

Compute standard errors using the inverse confidence interval method

    invcise lb_varname ub_varname [dof_varname] [if] [in] , stderr(newvarname)
            [ eformestimate(varname) level(#) replace float fast ]

where lb_varname, ub_varname and dof_varname are the names of existing variables, containing lower confidence bounds, upper confidence bounds, and degrees of freedom, respectively.

Description

invcise is intended for use in an output dataset (or resultsset), with one observation for each of a set of estimated parameters, and variables containing their confidence limits, and (optionally) containing the degrees of freedom used to calculate these confidence limits. Such datasets may be produced using the official Stata statsby prefix, or by the parmest package, downloadable from SSC. invcise uses the confidence limits to compute a new variable, containing standard errors for the parameters, using the inverse confidence interval method. These standard errors, together with parameter estimates in another variable in the dataset, may be used to calculate standard errors and confidence intervals for linear combinations of these parameters, using the metaparm module of the parmest package, assuming that the parameters are independently estimated. The inverse confidence interval method is frequently used with rank statistics, such as medians, median differences, and median slopes, to compute confidence intervals for linear combinations of these rank statistics, particularly differences between differences ("interactions") or weighted means of several differences ("meta-analysis summaries").

Options

stderr(newvarname) is required. It specifies the name of a new variable to be created, containing standard errors computed from the input confidence limit variables using the inverse confidence interval method.

eformestimate(varname) specifies the name of a variable, assumed to contain exponentiated estimates corresponding to the input confidence limits. If it is specified, then the standard error is calculated from the log ratio of the confidence limits, multiplied by the eformestimate() variable, and then scaled inversely by twice the critical t-value or z-value corresponding to the confidence level specified by level(). If eformestimate() is not specified, then the standard error is calculated from the difference between the confidence limits, scaled inversely by twice the critical t-value or z-value corresponding to the confidence level specified by level(). The eformestimate() option is useful if the standard errors are used with the eformestimate() variable for input to the metaparm or parmcip modules of the parmest package, using the eform option of these modules to produce exponentiated confidence intervals. Such exponentiated confidence intervals may be used to estimate parameters which are ratios, ratios of ratios, or geometric mean ratios.

level(#) specifies the confidence level assumed for the input confidence limits, expressed as a percentage. If level() is not specified, then invcise first attempts to extract the confidence level from the variable characteristic lb_varname[level], and then (if this attempt fails) attempts to extract the confidence level from ub_varname[level], and then (if this attempt also fails) extracts the confidence level from the c-class value c(level), which contains the default confidence level in force in Stata at the time, which is usually set to 95 to specify 95% confidence limits. The variable characteristic varname[level] is created, for a confidence limit variable with the name varname, by the modules of the parmest package, which all set this characteristic to be equal to the confidence level used in calculating the confidence limit variable.

replace specifies that any non-input variable with the same name as the new variable specified by the stderr() option will be discarded before the new standard error variable is created.

float specifies that float is the highest-precision numeric type to be allowed for the stderr() variable. If float is not specified, then the stderr() variable is created as a double variable. Whether or not float is specified, the stderr() variable is compressed to the lowest precision possible without loss of information.

fast is an option for programmers. It specifies that invcise will take no action to restore the existing dataset in memory in the event of failure, or if the user presses Break. If fast is not specified, then invcise will take this action, which uses an amount of time depending on the size of the dataset in memory.
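The lookup order for the confidence level described under level() can be written out by hand. The following is only a sketch of that order, not the code used inside invcise. The variable names lo and hi and the level of 90 are hypothetical.

```stata
clear
set obs 1
* Hypothetical lower and upper confidence limit variables
generate lo = 1
generate hi = 3

* Mimic what the parmest modules do: record the level as a characteristic
char lo[level] 90

* Fallback order: lower bound characteristic, then upper bound, then c(level)
local lev : char lo[level]
if "`lev'" == "" local lev : char hi[level]
if "`lev'" == "" local lev = c(level)

display "confidence level used: `lev'"
```

If the char line is removed, neither characteristic is set, and the sketch falls through to c(level).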

Methods and formulas

invcise computes standard errors using the inverse confidence interval method, which is an inversion of the method commonly used to compute confidence limits from estimates and standard errors.

The default formula (if eformestimate() is not specified) used to derive a standard error SE by inverting a 100*(1-alpha)% confidence interval with lower bound lb and upper bound ub is

    SE = 0.5*(ub - lb)/z(alpha)

(where z(alpha) is the result of invnorm(1-alpha/2)) if no degrees of freedom variable is specified, and is

    SE = 0.5*(ub - lb)/t(df,alpha)

(where t(df,alpha) is the result of invttail(df,alpha/2) and df is the degrees of freedom) if a degrees of freedom variable is specified.

If the eformestimate() option is specified, then the formula used is

    SE = 0.5*eformestimate*(log(ub) - log(lb))/z(alpha)

(where eformestimate is the variable specified by eformestimate()) if no degrees of freedom variable is specified, and is

    SE = 0.5*eformestimate*(log(ub) - log(lb))/t(df,alpha)

if a degrees of freedom variable is specified.
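The formulas above can be checked by hand on a small dataset. The sketch below uses made-up estimates, confidence limits, and degrees of freedom, and assumes 95% limits (alpha = 0.05). It uses invnormal(), the current name of invnorm(). Because invttail() takes an upper-tail probability, the positive critical t-value is invttail(df, alpha/2).

```stata
clear
* Hypothetical ratio estimates with 95% confidence limits (illustration only)
input est lb ub dof
2 1.25 3.2 10
4 2    8   25
1 0.5  2   40
end

local alpha = 0.05

* Default formula, normal-based and t-based
generate double se_z = 0.5*(ub - lb)/invnormal(1 - `alpha'/2)
generate double se_t = 0.5*(ub - lb)/invttail(dof, `alpha'/2)

* Formula used with eformestimate(): log-scale half-width, rescaled by est
generate double se_eform_z = 0.5*est*(log(ub) - log(lb))/invnormal(1 - `alpha'/2)
generate double se_eform_t = 0.5*est*(log(ub) - log(lb))/invttail(dof, `alpha'/2)

list est lb ub dof se_z se_t se_eform_z se_eform_t
```

The t-based standard errors are smaller than the normal-based ones for the same limits, because the critical t-value is larger than the critical z-value.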

These formulas are typically used with confidence intervals for rank statistics, such as percentiles and percentile differences. Lehmann (1963) discussed a standard error formula of this kind for Hodges-Lehmann median differences. McKean and Schrader (1984) discussed a standard error formula of this kind for medians, which was slightly modified by Price and Bonett (2001).

Usually, standard error formulas are a means to the end of calculating confidence intervals. The reason for inverting the usual practice is to calculate confidence intervals for linear combinations of independently estimated parameters, such as medians or median differences from independent subsamples from distinct subpopulations. These linear combinations are typically either weighted averages, or differences, or weighted averages of differences (as in a meta-analysis), or differences between differences (known as interactions, and viewed as important by some scientists). Bonett and Price (2002) discuss the general case of linear combinations of medians, and Price and Bonett (2002) discuss the special case of differences (and ratios) between two medians.

Given a list of independently-estimated parameters theta_1, ..., theta_N, with corresponding standard errors se_1, ..., se_N, and corresponding coefficients a_1, ..., a_N, we wish to estimate the linear combination

    Theta = Sum ( a_j * theta_j )

and its standard error

    SE = sqrt( Sum (a_j * se_j)^2 )

and we can easily do this using the metaparm module of the parmest package, once the standard errors have been calculated using invcise. We usually expect the Central Limit Theorem to work better for the linear combination than for its component parameters, which may be better estimated using their original confidence intervals, which were inverted using invcise to give their standard errors.
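The linear combination and its standard error can also be computed by hand, which may help when checking metaparm output. The sketch below uses two made-up parameters with coefficients 1 and -1 (a simple difference). The variable and scalar names are hypothetical.

```stata
clear
* Hypothetical estimates, standard errors, and coefficients
input theta se a
5 1.5  1
3 2.0 -1
end

* Terms of the two sums in the formulas above
generate double term  = a*theta
generate double vterm = (a*se)^2

quietly summarize term
scalar combo_est = r(sum)
quietly summarize vterm
scalar combo_se = sqrt(r(sum))

display "Theta = " scalar(combo_est) "   SE = " scalar(combo_se)
```

With these inputs the difference is 5 - 3 = 2, and its standard error is sqrt(1.5^2 + 2^2) = 2.5.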

Examples

The following sequence of commands reads in the auto data and adds a variable odd, indicating whether a car model is odd-numbered or even-numbered. This dataset is used in the examples, which compare differences in mileage between non-US cars and US cars within the odd-numbered and even-numbered groups.

    . sysuse auto, clear
    . gene byte odd=mod(_n,2)
    . lab def odd 0 "Even" 1 "Odd"
    . lab val odd odd
    . lab var odd "Odd numbered model"
    . describe
    . tab foreign odd, m

The following example starts by using centile, with the statsby prefix, to replace the dataset in memory with a new dataset, with one observation for each of 4 groups, defined by combinations of values for the variables odd and foreign, and variables containing the number of observations in each group in N, and estimates and lower and upper confidence bounds for the group medians in median, medmin and medmax. We then use invcise to compute a standard error for each median, and use metaparm to replace the new dataset with a third dataset, with one observation per group defined by a value of odd, and data on confidence intervals and P-values for differences between median values in non-US and US cars in the group. The second metaparm command lists a confidence interval for the difference (or interaction) between the foreign-US difference in odd-numbered models and the foreign-US difference in even-numbered models. The third metaparm command lists a confidence interval for the weighted mean foreign-US difference, averaging the differences in odd-numbered and even-numbered cars.

    . preserve
    . statsby N=r(N) median=r(c_1) medmin=r(lb_1) medmax=r(ub_1), by(odd foreign) noisily clear: centile mpg
    . list odd foreign N median medmin medmax
    . invcise medmin medmax, stderr(icse)
    . metaparm [iweight=(foreign==1)-(foreign==0)], by(odd) norestore sumvar(N) estimate(median) stderr(icse)
    . list odd N median min95 max95 p
    . metaparm [iweight=(odd==1)-(odd==0)], sumvar(N) estimate(median) stderr(icse) list(,)
    . metaparm [aweight=N], sumvar(N) estimate(median) stderr(icse) list(,)
    . restore

The following example compares Hodges-Lehmann median foreign-US differences, which are not necessarily the same parameters as foreign-US differences between medians. We start by using the censlope module of the somersd package, together with the parmby module of the parmest package, to replace the dataset in memory with a new dataset, with one observation per value of odd, and data on confidence intervals and P-values for foreign-US median differences. We then use invcise to compute standard errors inversely from the confidence limits. The first metaparm command lists a confidence interval and a P-value for the odd-even difference (or interaction) between foreign-US median differences. The second metaparm command lists a confidence interval for the weighted mean of the two foreign-US median differences, summarizing the foreign-US differences in the two groups. The confidence intervals are slightly narrower than the corresponding confidence intervals in the previous example, although they are for different parameters.

    . preserve
    . parmby "censlope mpg foreign, tdist estaddr", by(odd) escal(N) norestore ecol(cimat) rename(es_1 N ec_1_1 percent ec_1_2 meddif ec_1_3 mdmin ec_1_4 mdmax)
    . describe
    . list odd N dof meddif mdmin mdmax
    . invcise mdmin mdmax dof, stderr(icse)
    . metaparm [iweight=(odd==1)-(odd==0)] , sumvar(N) estimate(meddif) stderr(icse) dof(dof) list(,)
    . metaparm [aweight=N], sumvar(N) estimate(meddif) stderr(icse) dof(dof) list(,)
    . restore

The following example is similar to the previous example, but compares Hodges-Lehmann median foreign/US ratios instead of Hodges-Lehmann median foreign-US differences. We start by creating the variable logmpg as the log of mpg, and estimate the Hodges-Lehmann median ratios by exponentiating the Hodges-Lehmann median differences for logmpg. We then use invcise, with the eformestimate() option, to calculate inverse confidence interval standard errors for the median ratios. These are then input into metaparm as before, except that, this time, we use the eform option of metaparm, to estimate the odd/even ratios between foreign/US ratios, and to estimate the weighted geometric mean foreign/US ratio.

    . preserve
    . gene logmpg=log(mpg)
    . parmby "censlope logmpg foreign, tdist estaddr eform", eform by(odd) escal(N) norestore ecol(cimat) rename(es_1 N ec_1_1 percent ec_1_2 medrat ec_1_3 mrmin ec_1_4 mrmax)
    . describe
    . list odd N dof medrat mrmin mrmax
    . invcise mrmin mrmax dof, stderr(icse) eformestimate(medrat)
    . metaparm [iweight=(odd==1)-(odd==0)] , sumvar(N) estimate(medrat) stderr(icse) dof(dof) eform list(,)
    . metaparm [aweight=N], sumvar(N) estimate(medrat) stderr(icse) dof(dof) eform list(,)
    . restore

The parmest and somersd packages can both be downloaded from SSC.
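As a minimal sketch, the two packages can be installed from within Stata using the official ssc command. This assumes an internet connection and write access to the PLUS ado directory.

```stata
* Install the packages used in the examples above
ssc install parmest
ssc install somersd
```

The replace option of ssc install can be added to update a copy that is already installed.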

Saved results

invcise saves the following in r():

Scalars

    r(level)            confidence level

Macros

    r(lb)               name of lower confidence bound variable
    r(ub)               name of upper confidence bound variable
    r(dof)              name of degrees of freedom variable
    r(eformestimate)    name of eformestimate() variable
    r(levelsource)      source of confidence level

The returned result r(levelsource) may be level(), lb_varname[level], ub_varname[level], or c(level), indicating that the confidence level was derived from the level() option, from the level characteristic of the lower bound variable, from the level characteristic of the upper bound variable, or from the c-class value c(level), respectively.
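The saved results can be inspected directly after a call. The sketch below assumes that invcise is installed, and uses a small made-up dataset. The variable names medmin, medmax, and icse follow the examples above, and the level of 90 is arbitrary.

```stata
clear
* Hypothetical 90% confidence limits for two parameters
input medmin medmax
17 23
 2  8
end

* Level given explicitly, so r(levelsource) should report the level() option
invcise medmin medmax, stderr(icse) level(90)

return list
display "level = " r(level) ", taken from: `r(levelsource)'"
```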

Author

Roger Newson, National Heart and Lung Institute, Imperial College London, UK.
Email: r.newson@imperial.ac.uk

References

Bonett, D. G. and Price, R. M. 2002. Statistical inference for a linear function of medians: Confidence intervals, hypothesis testing, and sample size requirements. Psychological Methods 7(3): 370-383.

Lehmann, E. L. 1963. Nonparametric confidence intervals for a shift parameter. Annals of Mathematical Statistics 34(4): 1507-1512.

McKean, J. W. and Schrader, R. M. 1984. A comparison of methods for studentizing the sample median. Communications in Statistics - Simulation and Computation 13(6): 751-773.

Price, R. M. and Bonett, D. G. 2001. Estimating the variance of the sample median. Journal of Statistical Computation and Simulation 68(3): 295-305.

Price, R. M. and Bonett, D. G. 2002. Distribution-free confidence intervals for difference and ratio of medians. Journal of Statistical Computation and Simulation 72(2): 119-124.

Also see

Manual:  [R] centile, [D] statsby
On-line: help for centile, statsby
         help for parmest, parmby, parmcip, metaparm, somersd, censlope, cendif if installed