{smcl}
help {hi:wyoung}
{hline}

{title:Title}

{p 4 4 2}{cmd:wyoung} {hline 2} Control the family-wise error rate when performing multiple hypothesis tests.

{title:Syntax}

{p 4 8 2}Syntax 1: multiple hypothesis testing {hline 2} one model with multiple outcomes

{p 8 14 2}{cmd:wyoung} {help varlist:varlist}, {cmd:cmd(}{it:model}{cmd:)} {cmd:familyp(}{help varlist:varlist}{cmd:)} [{cmd:subgroup(}{help varname:varname}{cmd:)} {cmd:controls(}"{help varlist:varlist1}" ["{help varlist:varlist2}" ...]{cmd:)} {cmdab:r:eps(}{it:#}{cmd:)} {cmd:seed(}{it:#}{cmd:)} {cmd:permute(}{help varlist:varlist}{cmd:)} {cmdab:permutep:rogram(}{it:pgmname [, options]}{cmd:)} {cmd:strata(}{help varlist:varlist}{cmd:)} {cmd:cluster(}{help varlist:varlist}{cmd:)} {cmd:force} {cmdab:single:step} {cmd:detail} {cmd:noresampling} {cmd:familypexp} {cmd:replace}]

{p 4 8 2}Syntax 2: multiple hypothesis testing {hline 2} more general but lengthier syntax for specifying different models with multiple outcomes

{p 8 14 2}{cmd:wyoung}, {cmd:cmd("}{it:model1}{cmd:"} [{cmd:"}{it:model2}{cmd:"} ...]{cmd:)} {cmd:familyp("}{it:varname1}{cmd:"} [{cmd:"}{it:varname2}{cmd:"} ...]{cmd:)} [{cmdab:r:eps(}{it:#}{cmd:)} {cmd:seed(}{it:#}{cmd:)} {cmd:permute(}{help varlist:varlist}{cmd:)} {cmdab:permutep:rogram(}{it:pgmname [, options]}{cmd:)} {cmd:strata(}{help varlist:varlist}{cmd:)} {cmd:cluster(}{help varlist:varlist}{cmd:)} {cmd:force} {cmdab:single:step} {cmd:detail} {cmd:noresampling} {cmd:familypexp} {cmd:replace}]

{title:Options}

{p 4 8 2}
{cmd:cmd(}{cmd:)}, {cmd:familyp(}{cmd:)}, {cmd:subgroup(}{cmd:)}, {cmd:controls(}{cmd:)}

{p 8 8 2}
Syntax 1: one model with multiple outcomes (see example 1 below)

{p 12 12 2}
{cmd:cmd(}{it:model}{cmd:)} specifies a single model with the multiple outcomes {help varlist:varlist}. The outcome (dependent) variable is indicated in {it:model} by "OUTCOMEVAR" (upper case).
{cmd:wyoung} will estimate multiple outcome specifications by substituting each variable from {help varlist:varlist} into "OUTCOMEVAR".

{p 12 12 2}
{cmd:familyp(}{help varlist:varlist}{cmd:)} instructs {cmd:wyoung} to calculate adjusted {it:p}-values for the null hypotheses that the coefficients of {it:varlist} are equal to 0.

{p 12 12 2}
{cmd:subgroup(}{help varname:varname}{cmd:)} specifies an integer variable identifying subgroups. If {cmd:subgroup()} is specified, {cmd:wyoung} will estimate models separately for each subgroup. By default, specifying {cmd:subgroup()} will cause {cmd:wyoung} to select bootstrap/permutation samples within each subgroup, unless you specify otherwise in {cmd:strata()}. See example 4 below.

{p 12 12 2}
{cmd:controls(}"{help varlist:varlist1}" ["{help varlist:varlist2}" ...]{cmd:)} lets you specify different controls for each outcome. The control variables are indicated in {it:model} by "CONTROLVARS" (upper case). For the first outcome variable, {cmd:wyoung} will substitute {it:varlist1} into "CONTROLVARS", for the second outcome it will substitute {it:varlist2}, and so on. See example 7 below.

{p 12 12 2}
{cmd:controlsinteract(}"{help varlist:varlist1}" ["{help varlist:varlist2}" ...]{cmd:)} is an alternative to {cmd:controls()} that estimates the model separately for all pairwise combinations of outcome variables and specified controls. Each set of controls will be substituted into "CONTROLVARS" as specified in {it:model}. Specifying {it:N} different sets of controls ({it:varlist1}, {it:varlist2}, ..., {it:varlistN}) will multiply the number of hypotheses being tested by {it:N}. See example 8 below.

{p 8 8 2}
Syntax 2: different models with multiple outcomes (see example 2 below)

{p 12 12 2}
{cmd:cmd("}{it:model1}{cmd:"} [{cmd:"}{it:model2}{cmd:"} ...]{cmd:)} specifies a list of models.{p_end}
{p 12 12 2}
{cmd:familyp("}{it:varname1}{cmd:"} [{cmd:"}{it:varname2}{cmd:"} ...]{cmd:)} instructs {cmd:wyoung} to calculate adjusted {it:p}-values for the null hypotheses that the coefficient of {it:varname1} is equal to 0 in {it:model1}, the coefficient of {it:varname2} is equal to 0 in {it:model2}, etc. If only one {it:varname} is specified, {cmd:wyoung} applies it to all {it:model}s.

{p 4 8 2}
{cmdab:r:eps(}{it:#}{cmd:)} performs {it:#} bootstraps/permutations for resampling; default is {cmd:reps(100)}.

{p 4 8 2}
{cmd:seed(}{it:#}{cmd:)} sets the random-number seed. Specifying this option is equivalent to typing the following command prior to calling {cmd:wyoung}:

{phang2}
{cmd:. set seed} {it:#}

{p 4 8 2}
{cmd:strata(}{help varlist:varlist}{cmd:)} specifies variables that identify strata. If {cmd:strata()} is specified, bootstrap/permutation samples are selected within each stratum.

{p 4 8 2}
{cmd:cluster(}{help varlist:varlist}{cmd:)} specifies variables that identify clusters. If {cmd:cluster()} is specified, the bootstrap/permutation samples are selected treating each cluster, as defined by {it:varlist}, as one unit of assignment. This option is required if {it:model} includes clustered standard errors, unless {cmd:force} is specified. See example 3 below.

{p 4 8 2}
{cmd:permute(}{help varlist:varlist}{cmd:)} instructs {cmd:wyoung} to permute (rerandomize) {it:varlist} instead of drawing a bootstrap sample. When {it:varlist} includes more than one variable, those variables are permuted jointly, preserving their relations to each other. {it:varlist} is not permitted to include missing values, unless {cmd:force} is specified. If {cmd:strata()} is specified, {it:varlist} is permuted within strata. If {cmd:cluster()} is specified, permutations are performed treating each cluster as one unit.{p_end}
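{p 8 8 2}
As a minimal sketch of how {cmd:permute()} and {cmd:strata()} can be combined, the commands below use Stata's {cmd:auto} dataset. The variable {it:treat} is hypothetical: it is a randomly generated indicator created only for this illustration, so the resulting {it:p}-values carry no substantive meaning.{p_end}

{phang2}{cmd:. sysuse auto, clear}{p_end}
{phang2}{cmd:. set seed 20}{p_end}
{phang2}{cmd:. * treat is a hypothetical treatment indicator, created only for this illustration}{p_end}
{phang2}{cmd:. generate byte treat = runiform() < 0.5}{p_end}
{phang2}{cmd:. wyoung mpg headroom turn, cmd(regress OUTCOMEVAR treat) familyp(treat) permute(treat) strata(foreign) reps(100) seed(20)}{p_end}

{p 8 8 2}
In this sketch, {it:treat} is reshuffled within each level of {it:foreign} in every replication instead of drawing a bootstrap sample.{p_end}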
{p 4 8 2}
{cmdab:permutep:rogram(}{it:pgmname [, options]}{cmd:)} instructs {cmd:wyoung} to perform permutations by calling {it:pgmname}, with the {it:varlist} contents of {cmd:permute(}{it:varlist}{cmd:)} passed as the first argument and {it:options} passed as options (see example 10 below). By default, {cmd:strata()} and {cmd:cluster()} are also passed as options to {it:pgmname} if they were specified. As an example, suppose you have a program that accepts a custom string option, shuffles multiple permuted variables, and supports stratification and clustering. Specify your command and your custom option with the code {cmd:permuteprogram(myprogram, option1("myoption"))}. In your program, parse the inputs using the {help syntax:syntax} command:

{p 12 12 2}
{cmd:. syntax varlist [, strata(varname) cluster(varname) option1(string)]}

{p 4 8 2}
{cmd:force} allows the user to include a model with clustered standard errors without also specifying the {cmd:cluster()} bootstrap option, and to permute variables with missing values.

{p 4 8 2}
{cmdab:single:step} computes the single-step adjusted {it:p}-value in addition to the step-down value. Resampling-based single-step methods often control type III (sign) error rates. Whether their step-down counterparts also control the type III error rate is unknown (Westfall and Young 1993, p. 51).

{p 4 8 2}
{cmd:detail} produces sample size statistics for the bootstrap/permutation samples.

{p 4 8 2}
{cmd:noresampling} computes only the Bonferroni-Holm and Sidak-Holm adjusted {it:p}-values (very fast).

{p 4 8 2}
{cmd:familypexp} indicates that you are providing {cmd:familyp(}{help exp:exp}{cmd:)} instead of {cmd:familyp(}{help varlist:varlist}{cmd:)} when employing Syntax 1, where {help exp:exp} specifies a coefficient or combination of coefficients. {help exp:exp} follows the syntax of {help lincom:lincom} and {help nlcom:nlcom} and must not contain an equal sign.
If employing Syntax 2, then {cmd:familypexp} indicates that you are providing {cmd:familyp("}{it:exp1}{cmd:"} [{cmd:"}{it:exp2}{cmd:"} ...]{cmd:)} instead of {cmd:familyp("}{it:varname1}{cmd:"} [{cmd:"}{it:varname2}{cmd:"} ...]{cmd:)}. Specifying {cmd:familypexp} increases the set of possible hypothesis tests, but may cause {cmd:wyoung} to produce less helpful error messages when you make a syntax mistake.

{p 4 8 2}
{cmd:replace} replaces data in memory with {cmd:wyoung} results.

{title:Description}

{p 4 4 2}{cmd:wyoung} controls the family-wise error rate using the free step-down resampling methodology of Westfall and Young (1993). This method leverages resampling techniques, such as bootstrapping (sampling with replacement) or permutation (shuffling), to adjust the standard {it:p}-values obtained from model estimation. It also computes the Bonferroni-Holm and Sidak-Holm adjusted {it:p}-values.

{p 4 4 2}The family-wise error rate (FWER) is the probability of rejecting at least one true null hypothesis (commonly referred to as making a "false discovery") within a "family" of hypotheses. A procedure is said to provide {it:strong control} of the FWER if it maintains the error rate at or below a specified level regardless of which null hypotheses are true. In contrast, {it:weak control} of the FWER applies only under the assumption that all null hypotheses are true, i.e., when the complete null hypothesis holds.

{p 4 4 2}The Westfall-Young resampling algorithm provides strong control of the FWER under the condition of subset pivotality, a multivariate generalization of pivotality. Subset pivotality requires that the joint distribution of any subvector of {it:p}-values remains unaffected by the truth or falsehood of hypotheses corresponding to {it:p}-values not included in the subvector.
This condition is satisfied in many settings, including significance testing for coefficients in a general multivariate regression model with possibly non-normal or heteroskedastic errors.

{title:Methods and formulas}

{p 4 4 2}The free step-down resampling method implemented in {cmd:wyoung} follows Algorithm 2.8 of Westfall and Young (1993). The single-step resampling method, available via the {cmd:singlestep} option, follows Algorithm 2.5 of Westfall and Young (1993). Detailed documentation, including simulation results, can be found online at {browse "https://reifjulian.github.io/wyoung/documentation/wyoung.pdf":https://reifjulian.github.io/wyoung/documentation/wyoung.pdf}.

{p 4 4 2}The Bonferroni-Holm and Sidak-Holm step-down {it:p}-values are calculated as follows. Sort the {it:J} unadjusted {it:p}-values so that {it:p(1)