-------------------------------------------------------------------------------
help for multproc and smileplot7                                 (Roger Newson)
-------------------------------------------------------------------------------

Multiple test procedures and Stata 7 smile plots

    multproc [if exp] [in range] [ , puncor({#|scalarname|varname})
        pcor({#|scalarname|varname}) method(method_name) pvalue(varname)
        rank(newvarname) gpuncor(newvarname) critical(newvarname)
        gpcor(newvarname) nhcred(newvarname) reject(newvarname) float fast ]

    smileplot7 [if exp] [in range] [ , estimate(varname) logbase(#) nline(#)
        ptsymbol(symbol) ptlabel(varname) by(varname) multproc_options
        graph_options ]

by varlist: can be used with multproc and smileplot7. (See help for by.) If by varlist: is used, then the log output, and all generated variables, are calculated using the specified multiple test procedure within each by-group defined by the variables in the varlist.

Description

smileplot7 provides access to the Stata 7 version of smileplot, and is provided for users of Stata versions 8 or above who want to produce smile plots using the "quick and dirty" Stata 7 graphics. (Note that Stata 7 users can still download the Stata 7 version of smileplot under its old name from Roger Newson's website at http://www.imperial.ac.uk/nhli/r.newson, and this may save them from having to modify their Stata programs if and when they upgrade to a higher version of Stata.)

multproc takes, as input, a data set with one observation for each of a set of multiple statistical tests of multiple null hypotheses, including a variable containing P-values for these tests, and an uncorrected overall critical P-value specified by the user, and carries out a multiple test procedure. A multiple test procedure calculates a corrected overall critical P-value, which has the feature that an individual null hypothesis is considered to be acceptable if and only if its corresponding P-value is greater than the corrected overall critical P-value.

smileplot7 takes, as input, a data set with one observation for each of a set of estimated parameters, and data on their estimates and P-values. smileplot7 calls multproc to carry out a multiple test procedure, and then creates a smile plot, with data points corresponding to estimated parameters, the corresponding P-values (on a reverse log scale) on the Y-axis, and another variable (usually the corresponding parameter estimates) on the X-axis. There are Y-axis reference lines corresponding to the uncorrected and corrected overall critical P-values. The Y-axis reference line corresponding to the corrected overall critical P-value is known as the parapet line. Data points on or above the parapet line correspond to rejected null hypotheses. There may be a reference line on the X-axis corresponding to the value of the parameter under a null hypothesis (defaulting to 1 if the X-axis is logged, 0 otherwise).

The user can therefore see, at a glance, both the statistical significance and the practical significance of each parameter estimate, and can also see the parapet line as an "upper confidence bound" on the Y-axis for how many of the corresponding null hypotheses are true. multproc and smileplot7 are usually used on data sets with one observation per parameter estimate and data on estimates and their P-values. Such data sets may be created (directly or indirectly) by postfile, statsby, parmby or parmest.
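
As a minimal sketch of the kind of input that multproc expects, the commands below enter ten made-up P-values (illustrative only, not taken from any study), one per observation, and apply the Holm procedure with the default uncorrected critical P-value. The new variable names rnk, crit, pc and rej are arbitrary choices, and the sketch assumes that the multproc package is installed.

```stata
* Illustrative data: one observation per test (made-up P-values)
clear
input double p
0.001
0.008
0.039
0.041
0.042
0.060
0.074
0.205
0.212
0.216
end

* Holm step-down procedure, saving the ranks, the individual and
* overall critical P-values, and the rejection indicator
multproc, pvalue(p) method(holm) rank(rnk) critical(crit) gpcor(pc) reject(rej)

* Inspect the results in rank order
sort rnk
list p rnk crit pc rej
```

Each listed observation then shows a P-value next to its individual critical P-value, the corrected overall critical P-value, and the resulting decision.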

Options for multproc and smileplot7

puncor({#|scalarname|varname}) specifies the uncorrected overall critical P-value for statistical significance. This option may be specified either as a number, or as a scalar, or as a variable (in which case the variable is expected to contain only one non-missing value in the sample or by-group). If absent, this option is set to 1-$S_level/100, where $S_level is the value of the currently set default confidence level.

pcor({#|scalarname|varname}) specifies the corrected overall critical P-value for statistical significance. This option may be specified either as a number, or as a scalar, or as a variable (in which case the variable is expected to contain only one non-missing value in the sample or by-group). If absent, this option is set by the method specified in the method option (see below).

method(method_name) specifies the multiple test procedure method to be used for deriving the corrected P-value threshold from the uncorrected P-value threshold. This option is ignored, and set to userspecified, if the pcor option is specified and in the range 0 <= pcor <= 1. Otherwise, if method is absent, then it is set to bonferroni.

pvalue(varname) is the name of the variable containing the P-values. If this option is absent, then multproc looks for a variable named p (as created by parmby or parmest). multproc carries out a multiple test procedure on all observations selected by the if and/or in qualifiers which also have non-missing values for the variable containing the P-values.

rank(newvarname) is the name of a new variable to be generated, containing, in each observation, the rank of the corresponding P-value, from the lowest to the highest. Tied P-values are ranked according to their position in the input data set. If by varlist: is specified with multproc, then the ranks are defined within the by-group.

gpuncor(newvarname) is the name of a new variable to be generated, containing, in each observation, the uncorrected overall critical P-value, as specified by the puncor option, or by the standard default if the puncor option is not specified. This new variable will have the same value for all observations in the sample of observations used by multproc or smileplot7.

critical(newvarname) is the name of a new variable to be generated, containing, in each observation, an individual critical P-value corresponding to the original P-value in the variable specified by pvalue. The values of the individual critical P-values are defined by a non-decreasing function (specified by the method option) of the ranks of the corresponding original P-values (generated by the rank option). The corrected overall critical P-value is selected from the individual critical P-values in a way specified by the method option, depending on whether the method specified is a one-step method, a step-down method, or a step-up method.

gpcor(newvarname) is the name of a new variable to be generated, containing, in each observation, the corrected overall critical P-value, as specified by the pcor option, or by the method option if the pcor option is not specified. If by varlist: is specified with multproc, then the value of this new variable will be the same in all observations within each by-group, but may be different for observations in different by-groups, if a step-up or step-down procedure is specified by the method option.

nhcred(newvarname) is the name of a new variable to be generated, containing, for each observation, an indicator of the credibility of the corresponding null hypothesis under the method specified by the method option. This indicator is 1 if the null hypothesis is credible, and 0 otherwise. A null hypothesis is said to be credible if its P-value is greater than the corrected overall critical P-value. The set of observations with a value of 1 corresponds to a set of credible null hypotheses. The exact interpretation of the set of credible null hypotheses depends on whether the method specified controls the family-wise error rate (FWER) or the false discovery rate (FDR).

reject(newvarname) is the name of a new variable to be generated, containing, for each observation, an indicator of the rejection of the corresponding null hypothesis under the method specified by the method option. This indicator is 1 if the null hypothesis is rejected, and 0 otherwise. The new variable generated by the reject option is therefore the negation of the new variable generated by the nhcred option.
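
The complementary relationship between the two indicators can be checked directly. The following sketch, which assumes that multproc is installed and uses made-up P-values and arbitrary new variable names (cred and rej), generates both indicators under the Simes method and asserts that each is the negation of the other.

```stata
* Illustrative data: made-up P-values, one per observation
clear
input double p
0.0002
0.011
0.019
0.032
0.21
0.64
end

* Generate both indicator variables under the same method
multproc, pvalue(p) method(simes) nhcred(cred) reject(rej)

* reject should be 1 exactly where nhcred is 0
assert rej == !cred
tabulate cred rej
```

In practice only one of the two options is usually needed, whichever reads more naturally in later if qualifiers.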

float specifies that the individual critical P-value variable specified by critical (if requested) will be created as a float variable. If float is absent, then the critical variable is created as a double variable. Whether or not float is specified, all generated variables are stored to the lowest precision possible without loss of information.

fast is an option for programmers. It specifies that multproc and smileplot7 will take no action to ensure that the original data can be restored if the user presses Break.

Options available for smileplot7 only

estimate(varname) is the name of the variable to be plotted on the X-axis, usually containing the parameter estimates corresponding to the P-values specified by the pvalue option. If this option is absent, then smileplot7 looks for a variable named estimate (as created by parmby or parmest). smileplot7 carries out a multiple test procedure by calling multproc for observations with non-missing values for the variables specified by the estimate and pvalue options, using the if and/or in qualifiers if these are supplied by the user. Note that the variable specified by estimate may contain values that are not parameter estimates. For instance, the observations may correspond to genes in a genome scan, the P-values may be derived from tests for associations of those genes with a disease, and the X-axis variable specified by the estimate option may contain the positions of those genes on a chromosome map.

logbase(#) specifies a log base used to define the Y-axis labels. This log base is a factor by which each Y-axis label is divided to arrive at the next Y-axis label, where the Y-axis labels are ordered from the highest P-value to the lowest P-value. If absent, this option is set to 10, so the Y-axis labels are set to non-positive powers of 10. If this rule defines too many Y-axis labels, then the Y-axis labels are set to be every kth member of the logarithmic series, where k is the minimum positive integer such that the number of Y-axis labels defined in this way is not too large.

nline(#) specifies the position, on the X-axis, of the reference line indicating the value of the estimated parameters under the null hypothesis. If nline is unspecified, then it is set to 1 if xlog is specified and to 0 otherwise. This option allows the user to plot odds ratios and geometric mean ratios on a linear scale, instead of on the more usual log scale. If nline is set to a missing value by specifying nline(.), then the null reference line is suppressed. This is useful for creating "smile plots" for which the X-axis variable specified by the estimate option contains values other than parameter estimates, such as positions of genes on a chromosome map.

ptsymbol(symbol) specifies a graph symbol for the data points of the smile plot. If absent, it is set to T (triangles).

ptlabel(varname) specifies a variable to be used to label the data points. If absent, then the data points are plotted without labels.

by(varname) is a graph option, and works as for graph, creating one plot for each by-group, arranged in a square array. The corrected overall critical P-value, indicated by the parapet line, is calculated for all the P-values from all the by-groups pooled together, not for the subset of P-values in each by-group individually. (This is in contrast to the use of by varlist:, which causes corrected individual and overall critical P-values to be calculated only from the subset of P-values in each by-group.)
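
The contrast drawn above can be seen in miniature using multproc alone. In this sketch (made-up P-values, a hypothetical grouping variable grp, arbitrary new variable names, and multproc assumed to be installed), the Bonferroni threshold is computed once from all six P-values pooled, which is the kind of calculation described for the by(varname) option, and once within each by-group, as happens under the by varlist: prefix.

```stata
* Illustrative data: two groups of three made-up P-values
clear
input byte grp double p
1 0.004
1 0.020
1 0.300
2 0.010
2 0.015
2 0.700
end

* Pooled: one threshold from all 6 P-values (puncor/6)
multproc, pvalue(p) method(bonferroni) puncor(0.05) gpcor(pc_pooled)

* By-group: one threshold per group of 3 P-values (puncor/3)
sort grp
by grp: multproc, pvalue(p) method(bonferroni) puncor(0.05) gpcor(pc_bygrp)

list grp p pc_pooled pc_bygrp
```

The pooled threshold is the stricter of the two here, because it corrects for twice as many tests.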

Remarks

Multiple test procedures and smile plots are reviewed in Newson et al. (2003). The smile plot is so named because, if the standard errors of the parameters are similar, then the data points fall along a curve shaped like a smile. It summarises a set of multiple parameter estimates graphically, in the way that a Cochrane forest plot summarises a meta-analysis. The Y-axis reference line corresponding to the corrected overall critical P-value is known as the parapet line. Data points on or above the parapet line correspond to parameters for which we can reject the null hypotheses under the specified multiple test procedure. Data points below the parapet line correspond to parameters for which the null hypotheses are credible (acceptable).

The methods specified by the method option are multiple test procedures for defining an upper confidence bound for the set of null hypotheses that are true, given multiple parameter estimates with multiple P-values. More formally, each method defines a set of credible (or acceptable) null hypotheses and a set of incredible (rejected) null hypotheses, whose exact interpretation depends on the method. The uncorrected overall P-value may either be treated as an upper bound for the family-wise error rate (FWER), or be treated as an upper bound for the false discovery rate (FDR).

The FWER is the probability that at least one true null hypothesis is rejected. If a method controls the FWER, then the power set of the set of credible null hypotheses is a power-set-valued confidence region for a set-valued parameter, namely the set of null hypotheses which are true. We can therefore say, with a confidence level of 100*(1-puncor) percent, that the set of null hypotheses that are true is some subset (possibly empty) of the set of credible null hypotheses. In other words, we are 100*(1-puncor) percent confident that all the rejected null hypotheses are false. FWER-controlling procedures are reviewed in Wright (1992).

The FDR is defined as follows. Let V denote the number of true null hypotheses rejected, and let R denote the total number of null hypotheses rejected. Then the FDR is equal to the expectation of Q, where Q is defined to be equal to V/R if R>0, and equal to zero if R=0. The probability that Q=1 can be no more than the FDR. Therefore, if the method controls the FDR, then we can say, with 100*(1-puncor) percent confidence, that the set of null hypotheses that are true is a subset of null hypotheses (possibly empty) which does not contain the rejected set as a non-empty subset. In other words, we are 100*(1-puncor) percent confident that at least some of the rejected null hypotheses are false. If the number of null hypotheses tested is very large indeed, then, arguably, we may be 100 percent confident that 100*(1-puncor) percent of the rejected null hypotheses are false.

The methods may also be classified into one-step, step-down and step-up procedures. All three classes of methods work by defining a list of m individual critical P-values C_1,...,C_m, one for each of the m individual input P-values P_1,...,P_m, ranked from the lowest to the highest. These individual critical P-values can be saved as output using the critical option, and are defined as a non-decreasing function of the ranks of the original P-values, which can be saved as output using the rank option. An overall corrected critical P-value pcor is selected from the individual critical P-values. A null hypothesis is acceptable if and only if its P-value is greater than the overall corrected critical P-value. For a one-step procedure, the C_i are all equal to the overall corrected critical P-value pcor, which is defined as a function of the uncorrected critical P-value puncor. For a step-down procedure, pcor is equal to the lowest C_i such that P_i > C_i, if such a C_i exists, and equal to C_m otherwise. For a step-up procedure, pcor is equal to the highest C_i such that P_i <= C_i, if such a C_i exists, and equal to C_1 otherwise.

The different methods use different assumptions. Some assume that the different P-values are statistically independent, others allow the different P-values to be non-negatively correlated, and others allow the different P-values to be arbitrarily correlated. The more recently developed methods are documented in their original source papers. The available methods are as follows:

    Method         Step type  FWER/FDR  Definition or source
    userspecified  One-step   Either    pcor option
    bonferroni     One-step   FWER      pcor=puncor/m
    sidak          One-step   FWER      pcor=1-(1-puncor)^(1/m) (or Sidak, 1967)
    holm           Step-down  FWER      Holm, 1979
    holland        Step-down  FWER      Holland and Copenhaver, 1987
    liu1           Step-down  FDR       Benjamini and Liu, 1999a
    liu2           Step-down  FDR       Benjamini and Liu, 1999b
    hochberg       Step-up    FWER      Hochberg, 1988
    rom            Step-up    FWER      Rom, 1990
    simes          Step-up    FDR       Benjamini and Hochberg, 1995 (or Benjamini
                                        and Yekutieli, 2001 (first method))
    yekutieli      Step-up    FDR       Benjamini and Yekutieli, 2001 (second method)
    krieger        Step-up    FDR       Benjamini, Krieger and Yekutieli, 2001

Note that, in the case of the holland method, the procedure used is the simplified (and less powerful) version of the procedure of Holland and Copenhaver (1987), which takes no account of logical dependencies between the null hypotheses, although it takes advantage of non-negative dependencies between the P-values. The simes method is so named because it was proposed in Simes (1986), although its justification in terms of the FDR was presented in the references indicated above.
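
To make the definitions above concrete, the following sketch computes the two closed-form one-step thresholds from the table and then traces the step-up rule by hand, taking the linear critical values C_i = puncor*i/m of Benjamini and Hochberg (1995) as the example. It uses only built-in Stata commands and made-up P-values. It is an illustration of the definitions given here, not the internal code of multproc, and the names alpha, rnk, crit, pcor and rej are arbitrary.

```stata
* Illustrative data: made-up P-values, one per observation
clear
input double p
0.001
0.008
0.039
0.041
0.042
0.060
0.074
0.205
0.212
0.216
end

* Uncorrected critical P-value (puncor) and number of tests m
scalar alpha = 0.05
local m = _N

* One-step thresholds from the table of methods
display "bonferroni: " scalar(alpha)/`m'
display "sidak:      " 1 - (1 - scalar(alpha))^(1/`m')

* Step-up rule with linear critical values C_i = alpha*i/m
sort p
generate long rnk = _n
generate double crit = scalar(alpha)*rnk/`m'

* pcor = highest C_i such that P_i <= C_i, or C_1 if there is none
summarize crit if p <= crit, meanonly
if r(N) > 0 {
    scalar pcor = r(max)
}
else {
    scalar pcor = crit[1]
}

* A null hypothesis is rejected if and only if its P-value <= pcor
generate byte rej = p <= scalar(pcor)
list rnk p crit rej
display "corrected overall critical P-value: " scalar(pcor)
```

With these ten values the rule selects C_2 = 0.01, so the null hypotheses for the two smallest P-values are rejected. A step-down rule would instead scan upwards from the lowest rank and stop at the first P_i that exceeds its C_i.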

Examples

If we type the following example in the auto data, then a smile plot will be produced with 1 observation per parameter of the fitted model. The corrected P-value defines an upper confidence bound for how many of these parameters are 0 in the population from which these cars were sampled.

    . parmby "xi:regress mpg i.rep78 i.foreign",label norestore
    . smileplot7 if parm!="_cons",me(holm) ptl(label)

If we type the following example in the auto data, then a pair of smile plots will be created, one for US-made cars and one for non-US cars, with one data point for each parameter of the model (other than the intercept). The corrected P-value is corrected for the total number of parameters for both car types (US and non-US).

    . parmby "xi:regress mpg weight i.rep78",label norestore by(foreign)
    . smileplot7 if parm!="_cons",ptl(parm) by(foreign)

The following advanced example demonstrates the use of by varlist: together with the by option of smileplot7. The example assumes that there is a data set in memory, with 1 observation per parameter estimate. The data set contains variables or and siglev, containing estimated odds ratios and P-values respectively, and also identifier variables outcome, exposure, subset and adjusted. The program multproc is used to carry out the Simes method on each subset defined by the variable adjusted, storing the uncorrected and corrected overall critical P-values in new variables uncp and corp, and a hypothesis rejection indicator in a new variable signif. We then use smileplot7 to create, for each combination of values of adjusted and outcome, an array of smile plots for each value of subset, with data points labelled by the value of exposure. Finally, the rejected null hypotheses are listed.

    . sort adjusted outcome subset exposure
    . by adjusted:multproc,pval(siglev) meth(simes) gpunc(uncp) gpcor(corp) rej(signif)
    . by adjusted outcome:smileplot7,est(or) pval(siglev) punc(uncp) pcor(corp) by(subset) ptl(exposure) xlog t1(" ")
    . by adjusted outcome:list if signif,nodisp

Author

Roger Newson, Imperial College London, UK. Email: r.newson@imperial.ac.uk

References

Benjamini, Y. and Y. Hochberg. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B 57: 289-300.

Benjamini, Y., A. Krieger, and D. Yekutieli. 2001. Two staged linear step-up FDR controlling procedure. Pre-publication draft downloadable from Yoav Benjamini's website at http://www.math.tau.ac.il/~ybenja/.

Benjamini, Y. and W. Liu. 1999a. A step-down multiple hypotheses testing procedure that controls the false discovery rate under independence. Journal of Statistical Planning and Inference 82: 163-170. Pre-publication draft downloadable from Yoav Benjamini's website at http://www.math.tau.ac.il/~ybenja/.

Benjamini, Y. and W. Liu. 1999b. A distribution-free multiple-test procedure that controls the false discovery rate. Report, Dept. of Statistics and OR, Tel Aviv University, RP-SOR-99-3. Pre-publication draft downloadable from Yoav Benjamini's website at http://www.math.tau.ac.il/~ybenja/.

Benjamini, Y. and D. Yekutieli. 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29: 1165-1188. Pre-publication draft downloadable from Yoav Benjamini's website at http://www.math.tau.ac.il/~ybenja/.

Hochberg, Y. 1988. A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75: 800-802.

Holland, B. S. and M. D. Copenhaver. 1987. An improved sequentially rejective Bonferroni test procedure. Biometrics 43: 417-423.

Holm, S. 1979. A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6: 65-70.

Newson, R. and the ALSPAC Study Team. 2003. Multiple-test procedures and smile plots. The Stata Journal 3(2): 109-132. Pre-publication draft downloadable from Roger Newson's website at http://www.imperial.ac.uk/nhli/r.newson.

Rom, D. M. 1990. A sequentially rejective test procedure based on a modified Bonferroni inequality. Biometrika 77: 663-665.

Sidak, Z. 1967. Rectangular confidence regions for the means of multivariate normal distributions. Journal of the American Statistical Association 62: 626-633.

Simes, R. J. 1986. An improved Bonferroni procedure for multiple tests of significance. Biometrika 73: 751-754.

Wright, S. P. 1992. Adjusted P-values for simultaneous inference. Biometrics 48: 1005-1013.

Also see

    Manual:  [R] by, [R] statsby, [G] graph7
   On-line:  help for by, statsby, postfile, graph7
             help for smileplot, parmby and parmest if installed