{smcl}
{* *! version 4.03  David Fisher  12oct2022}{...}
{vieweralsosee "ipdover" "help ipdover"}{...}
{vieweralsosee "forestplot" "help forestplot"}{...}
{vieweralsosee "metan" "help metan"}{...}
{vieweralsosee "metani" "help metani"}{...}
{viewerjumpto "Syntax" "ipdmetan##syntax"}{...}
{viewerjumpto "Description" "ipdmetan##description"}{...}
{viewerjumpto "Options" "ipdmetan##options"}{...}
{viewerjumpto "Saved results" "ipdmetan##saved_results"}{...}
{viewerjumpto "Examples" "ipdmetan##examples"}{...}
{viewerjumpto "References" "ipdmetan##references"}{...}
{title:Title}

{phang}
{cmd:ipdmetan} {hline 2} Perform two-stage individual participant data (IPD) meta-analysis


{marker syntax}{...}
{title:Syntax}

{phang}
Syntax 1: {it:command}-based syntax; "generic" effect measure

{p 8 18 2}
{cmd:ipdmetan}
	[{it:{help exp_list}}]
	{cmd:, {ul:s}tudy(}{it:varname} [{cmd:, {ul:m}issing}]{cmd:)} [{it:options}] {cmd::} {it:command} {ifin} {it:...}

{phang}
Syntax 2: {bf:{help collapse}}-based syntax; "specific" effect measure

{p 8 18 2}
{cmd:ipdmetan}
	{it:input_varlist} {ifin}
	{cmd:, {ul:s}tudy(}{it:varname} [{cmd:, {ul:m}issing}]{cmd:)} [{it:options}]

{pstd}
where {it:input_varlist} is one of the following:

{p 8 34 2}{it:var_outcome} {it:var_treat}{space 11}where {it:var_outcome} and {it:var_treat} are both binary 0, 1{p_end}
{p 8 34 2}{it:var_outcome} {it:var_treat}{space 11}where {it:var_outcome} is continuous and {it:var_treat} is binary 0, 1{p_end}
{p 8 34 2}{it:var_treat}{space 23}where {it:var_treat} is binary 0, 1 and the data have previously been {bf:{help stset}}.{p_end}


{pstd}
The terms "generic" and "specific" distinguish the two approaches.
With Syntax 1, effect sizes and standard errors are derived from {it:command} and could,
in the inverse-variance meta-analysis context, be interpreted generically (although in practice, of course, the interpretation is governed by {it:command}).
With Syntax 2, {bf:{help collapse}} is used to convert the IPD directly to an aggregate dataset with a specific data structure,
such as a 2x2 contingency table, or means and SDs by treatment arm.
Because the two syntaxes differ substantially, the terms "Syntax 1" and "Syntax 2" are used throughout the remainder of the {cmd:ipdmetan} documentation.


{synoptset 34 tabbed}{...}
{synopthdr}
{synoptline}
{syntab :Main}
{synopt :{it:{help metan##options:options}}}any {bf:{help metan}} options, as appropriate, except {opt npts(varname)}{p_end}

{syntab :Syntax 1 only}
{synopt :{opt me:ssages}}print messages relating to success of model fits{p_end}
{synopt :{opt inter:action}}automatically identify and pool a treatment-covariate interaction{p_end}
{synopt :{opt pool:var(model_coefficient)}}specify explicitly the coefficient to pool{p_end}
{synopt :{opt notot:al}}suppress initial fitting of {it:command} to the entire dataset{p_end}
{synopt :{opt wgt(exp)}}user-defined weights{p_end}

{syntab :Syntax 2 only}
{synopt :{cmd:wgt(}[{opt (stat)}] {it:varname}{cmd:)}}user-defined weights{p_end}

{syntab :Syntax 2 with {opt logrank} only}
{synopt :{opt st:rata}}specify further variables by which to stratify the log-rank calculations{p_end}

{syntab :Combined IPD/aggregate data analysis}
{synopt :{cmd:ad(}{it:{help filename}} {ifin}{cmd:,} {help ipdmetan##aggregate_data_options:{it:aggregate_data_options}}{cmd:)}}
combine IPD with aggregate data stored in {it:filename}{p_end}

{syntab :Forest plots}
{synopt :{cmdab:lcol:s(}{help ipdmetan##cols_info:{it:cols_info}}{cmd:)} {cmdab:rcol:s(}{help ipdmetan##cols_info:{it:cols_info}}{cmd:)}}
display (and/or save) columns of additional data{p_end}
{synopt :{cmd:npts}}display participant numbers in the forest plot{p_end}
{synopt :{cmd:plotid(}{it:varname}{cmd:|_BYAD} [{cmd:, {ul:l}ist {ul:nogr}aph}]{cmd:)}}
define groups of observations in which to apply specific plot rendition options{p_end}
{synopt :{it:{help metan##options_main:metan_fplotopts}}}other options pertaining to the forest plot
as described in {bf:{help metan}} under "Forest plot and/or saved data"{p_end}
{synopt :{cmdab:forest:plot(}{help forestplot##options:{it:forestplot_options}}{cmd:)}}other options as described in {bf:{help forestplot}}{p_end}
{synoptline}

{pstd}
where {it:model_coefficient} is a variable name, a level indicator, an interaction indicator,
or an interaction involving continuous variables (cf. syntax of {help test}).

{marker cols_info}{...}
{pstd}
and where {it:cols_info} has the following syntax, which is based on that of {bf:{help collapse}}:

{pmore}
[{opt (stat)}] [{it:newname}=]{it:item} [{it:%fmt}] [{cmd:"}{it:label}{cmd:"}] [[{it:newname}=]{it:item} [{it:%fmt}] [{cmd:"}{it:label}{cmd:"}] ] {it:...} [ [{opt (stat)}] {it:...}]

{pmore}
where {it:stat} is as defined in {bf:{help collapse}};
{it:newname} is an optional user-specified variable name;
{it:item} is the name of either a numeric returned quantity from {it:command} (in parentheses, see {it:{help exp_list}})
or a variable currently in memory; {it:%fmt} is an optional {help format}; and {cmd:"}{it:label}{cmd:"} is an optional variable label.

{marker aggregate_data_options}{...}
{synopthdr :aggregate_data_options}
{synoptline}
{synopt :{opt vars(varlist)}}variables containing effect size and either standard error or 95% confidence limits, on the normal scale{p_end}
{synopt :{opt npts(varname)}}variable containing participant numbers{p_end}
{synopt :{opt byad}}IPD and aggregate data are to be treated as subgroups (rather than as a single set of estimates){p_end}
{synopt :{opt logr:ank}}specify that {opt vars()} are to be interpreted as {it:O-E} and {it:V}{p_end}
{synopt :{opt rel:abel}}force relabelling of studies within the combined IPD/aggregate dataset{p_end}
{synoptline}

{p2colreset}{...}
{p 4 6 2}

{marker description}{...}
{title:Description}

{pstd}
{cmd:ipdmetan} performs two-stage individual participant data (IPD) meta-analysis.  There are two basic syntaxes, as shown above.

{pstd}
Syntax 1 fits the model {it:command} once within each level of {it:study_ID}
and saves effect sizes and standard errors. By default these are pooled using inverse-variance, with output displayed on screen and in a forest plot.
Any e-class regression command (whether built-in or user-defined) should be compatible with this syntax of {cmd:ipdmetan}.
Some {help prefix} commands are also valid; see below.

{pmore}
In the case of non e-class commands (those which do not change the contents of {cmd:e(b)}),
the effect size and standard error statistics to be collected from the execution of {it:command}
must be specified manually by supplying {it:{help exp_list}}.
If {it:command} changes the contents in {cmd:e(b)}, {it:exp_list} defaults to
{cmd:_b[}{it:varname}{cmd:]} {cmd:_se[}{it:varname}{cmd:]},
where {it:varname} is the first independent variable within {it:command}.
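
{pmore}
As a minimal sketch, assuming the dataset from the {help ipdmetan##examples:Examples} section is in memory and has been {bf:{help stset}},
the following two commands would be expected to pool the same coefficient:
the first relies on the default {it:exp_list}, while the second supplies it explicitly.
The variable names {bf:trialid}, {bf:trt} and {bf:sex} are taken from that example dataset.

{p 12 16 2}
{cmd:. ipdmetan, study(trialid) hr nograph : stcox trt, strata(sex)}{p_end}
{p 12 16 2}
{cmd:. ipdmetan (_b[trt]) (_se[trt]), study(trialid) hr nograph : stcox trt, strata(sex)}{p_end}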

{pmore}
The following {help prefix} commands are valid with {cmd:ipdmetan}: {cmd:bootstrap}; {cmd:jackknife}; {cmd:svy}; {cmd:mi estimate, post}; {cmd:bayes}; {cmd:version}.

{pstd}
Syntax 2 converts the IPD to aggregate data using {bf:{help collapse}};
after which any of the analysis methods used by {bf:{help metan}},
such as Mantel-Haenszel and Standardised Weighted Means, become applicable.

{pmore}
There are three ways in which {cmd:ipdmetan} Syntax 2 can convert IPD to aggregate data, as listed under {help ipdmetan##syntax:Syntax}.
In the first case (binary outcome; binary treatment variable), the data will be summarised by study as cell counts from a 2x2 contingency table.
In the second case (continuous outcome; binary treatment variable), the data will be summarised by study as means and SDs by treatment arm.
Finally, if a binary treatment variable alone is supplied and the data have been {bf:{help stset}},
the survival data will be summarised by study using Peto logrank {it:O-E} and {it:V} statistics
(note that this supersedes {cmd:petometan} which previously formed part of the {cmd:ipdmetan} package).

{pmore}
Note that with Syntax 2, the effect measure must be made explicit by use of an option such as {opt rr}, {opt or}, {opt hr}, {opt smd} or {opt logrank};
see {bf:{help metan}} and {it:{help eform_option}}.
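
{pmore}
As an illustrative sketch only, the failure indicator {bf:fail} from the example dataset (see {help ipdmetan##examples:Examples})
could be treated as a simple binary outcome, ignoring follow-up time.
Under that assumption the first form of {it:input_varlist} applies, and an odds ratio may be requested with {opt or}.
Whether such an analysis is appropriate for survival data is a separate question; the command is shown purely to illustrate the syntax.

{p 12 16 2}
{cmd:. ipdmetan fail trt, study(trialid) or nograph}{p_end}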

{pstd}
This version of {cmd:ipdmetan} requires version 4.0+ of the package {cmd:metan} to be installed.
This package is available from the SSC archive; type {stata ssc describe metan}.


{marker options}{...}
{title:Options}

{dlgtab:Main}

{phang}
{cmd:study(}{it:study_ID} [{cmd:, missing}]{cmd:)} (required) specifies the variable containing the study identifier,
which must be either integer-valued or string.

{pmore}
{opt missing} requests that missing values be treated as potential study identifiers; the default is to exclude them.

{phang}
{opt interaction} (Syntax 1 only) indicates that {it:command} contains one or more interaction effects
supplied using factor-variable syntax (see {help fvvarlist}),
and that the first valid interaction effect should be pooled across studies.
This is intended as a helpful shortcut for performing two-stage "deft" interaction analyses as described in {help ipdmetan##references:Fisher 2017}.
However, it is not foolproof, and the identified coefficient should be checked carefully.
Alternatively, the desired coefficient to be pooled may be supplied directly using {opt poolvar()}.

{pmore}
Note that {cmd:forestplot(interaction} [...] {cmd:)} works slightly differently.
Valid with both Syntax 1 and Syntax 2, this option requests that {cmd:forestplot} apply the set of plot-related options appropriate for interactions;
that is, using circles in place of squares and diamonds.
This may be useful in situations where the pooled effect is to be interpreted as an interaction,
but {it:command} is not formulated such that a standard interaction coefficient is present (or if {opt poolvar()} is used; see above).

{pmore}
{opt interaction} implies {cmd:forestplot(interaction} [...] {cmd:)}; there is no need to specify {opt interaction} twice.

{phang}
{opt messages} (Syntax 1 only) requests that information be printed to screen regarding whether effect size and standard error statistics
have been successfully obtained from each study, and (if applicable) whether the iterative random-effects calculations
converged successfully.

{phang}
{opt nototal} (Syntax 1 only) requests that {it:command} not be fitted within the entire dataset, e.g. for time-saving reasons.
By default, such fitting is done to check for problems in convergence and in the validity of requested coefficients and
returned expressions. If {opt nototal} is specified, either {opt poolvar()} or {it:exp_list} must be supplied,
and a message appears above the table of results warning that estimates should be double-checked by the user.
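
{pmore}
For instance, a sketch using the example dataset (see {help ipdmetan##examples:Examples}; data assumed to be {bf:{help stset}})
might skip the initial whole-dataset fit and name the coefficient to pool explicitly:

{p 12 16 2}
{cmd:. ipdmetan, study(trialid) hr nototal poolvar(trt) nograph : stcox trt, strata(sex)}{p_end}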

{phang}
{opt poolvar(model_coefficient)} (Syntax 1 only) allows the coefficient to be pooled to be explicitly stated in situations where it may not be obvious,
or where {cmd:ipdmetan} has previously made an incorrect assumption. {it:model_coefficient} should be a variable name,
a level indicator, an interaction indicator, or an interaction involving continuous variables (cf. syntax of {help test}).
To use equations, use the format {cmd:poolvar(}{it:eqname}{cmd::}{it:varname}{cmd:)}.
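
{pmore}
As a hedged sketch using the example dataset (see {help ipdmetan##examples:Examples}),
the treatment-by-stage interaction could be requested by name rather than via {opt interaction}.
The coefficient name {bf:1.trt#c.stage} assumes that {bf:trt} is coded 0/1;
it is worth fitting {it:command} once and inspecting {cmd:e(b)} (for example with {cmd:matrix list e(b)}) to confirm the exact name before pooling.

{p 12 16 2}
{cmd:. ipdmetan, study(trialid) hr poolvar(1.trt#c.stage) nograph : stcox trt##c.stage}{p_end}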

{phang}
{opt strata(varlist)} (Syntax 2 with {opt logrank} only) specifies further variables to be used in log-rank calculations
but not to be presented in the output.

{phang}
{opt wgt()} specifies user-defined weights. With Syntax 1, {opt wgt()} expects an expression involving returned statistics.
For example, to weight on the number of observations, you might specify {bf:wgt(e(N))}.
With Syntax 2, {opt wgt()} expects a {bf:{help collapse}}-based syntax.
Hence, again, to weight on the number of observations, you might specify {bf:wgt((sum) cons)} where {bf:cons} is a variable containing 1 for all observations. You should only use this option if you are satisfied that the weights are meaningful.

{pmore}
Note that the scale of user-defined weights is immaterial, since individual weights are normalised.
Hence, if the {opt saving()} option is used, an analysis may be recreated from within the saved dataset
using the option {cmd:wgt(_WT)}.
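
{pmore}
A minimal Syntax 1 sketch, assuming the example dataset (see {help ipdmetan##examples:Examples}) is in memory and {bf:{help stset}},
weights each study by the number of observations used in its model fit, as returned by {cmd:stcox} in {cmd:e(N)}:

{p 12 16 2}
{cmd:. ipdmetan, study(trialid) hr wgt(e(N)) nograph : stcox trt, strata(sex)}{p_end}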


{dlgtab:Combined IPD/aggregate data analysis}

{phang}
{cmd:ad(}[{it:{help filename}}] {ifin}{cmd:,} {it:aggregate_data_options}{cmd:)}
allows aggregate (summary) data to be included in the analysis alongside IPD, for example if some studies do not have IPD available.

{phang}
{it:aggregate_data_options} are as follows:

{pmore}
{opt vars(varlist)} contains the names of variables containing the effect size
and either a standard error or lower and upper 95% confidence limits, on the linear scale.
If {it:{help filename}} is supplied, {it:varlist} will be taken from within the external file;
otherwise {it:varlist} will be taken from the data currently in memory.
If confidence limits are supplied, they must be derived from a Normal distribution or the pooled result will not be accurate (see {bf:{help metan}}).

{pmore}
{opt npts(varname)} allows participant numbers stored in {it:varname} within {it:filename} to be displayed in tables and forest plots.

{pmore}
{opt byad} specifies that aggregate data and IPD respectively are to be treated as subgroups.

{pmore}
{opt logrank} specifies that {opt vars(varlist)} contains the statistics {it:O-E} and {it:V}
rather than the default {it:ES} and {it:seES}.

{pmore}
{opt relabel} tells {cmd:ipdmetan} to re-label all studies sequentially as "1", "2", etc.,
(copying value labels across, if found) in the event that value labels do not agree between IPD and aggregate datasets.

{pmore}
Note that subgroups in aggregate data may be analysed in the same way as for IPD - that is, with the {opt by(varname)} option to {cmd:ipdmetan}.
{it:varname} may be found in either the data in memory (IPD), or in the aggregate dataset, or both.


{dlgtab:Forest plots}

{phang}
{cmd:lcols(}{help ipdmetan##cols_info:{it:cols_info}}{cmd:)}, {cmd:rcols(}{help ipdmetan##cols_info:{it:cols_info}}{cmd:)}
define columns of additional summary data to be presented to the left or right of the forest plot.
With {cmd:metan} these options simply require a {it:varlist}, but in the IPD context the syntax is more complicated.
The user may specify summary statistics by which to {bf:{help collapse}} the data,
as well as characteristics of the summarised variables such as name, title and format which will be carried over to the forest plot.

{pmore}
Specifying {it:newname} is only necessary in circumstances where the name of the variable is important
in the dataset underlying the forest plot (i.e. the dataset created by {opt saving()} if applicable).
For example, you may have an aggregate dataset with a variable containing data equivalent to an {it:item},
and wish for all such data (whether IPD or aggregate) to appear in a single column in the forest plot.
To achieve this, specify {it:newname} as the name of the relevant variable in the aggregate dataset.
Make sure that the variables in the IPD and aggregate datasets do not have conflicting {help data_types} (e.g. string vs numeric)
or a {bf:{help merge}} error will be returned.

{pmore}
If {it:item} is an existing string variable, the first non-empty observation for each study will be used,
and {it:item} will not be displayed alongside overall or subgroup pooled estimates.
(This behaviour may also be forced upon a numeric variable by first converting it into a string using {bf:{help decode}} or {bf:{help tostring}}.)

{pmore}
Formatting of forest plot columns may be controlled using {help ipdmetan##cols_info:{it:cols_info}}.
Note that unlike {opt (stat)}, {it:%fmt} only applies to each immediately preceding {it:item}.
By default, Stata displays strings as right-justified.
A single string-valued forest plot column may be left-justified as described in {help format}.
To left-justify {ul:all} strings in the forest plot, the {help forestplot##options:{it:forestplot_option}} {opt leftjustify} has been provided.

{pmore}
(Note that the syntax of {opt lcols()} and {opt rcols()} with {bf:{help metan}} and {bf:{help forestplot}} is as a list of existing variable names only.)
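
{pmore}
As a sketch of the {it:cols_info} syntax, the following adds a left-hand column showing the within-study mean of {bf:stage},
formatted to two decimal places and given a column label.
The variables are those of the example dataset (see {help ipdmetan##examples:Examples});
{bf:meanstage} is a hypothetical {it:newname} chosen for illustration.

{p 12 16 2}
{cmd:. ipdmetan, study(trialid) hr lcols((mean) meanstage=stage %4.2f "Mean stage") : stcox trt, strata(sex)}{p_end}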

{phang}
{cmd:npts} requests that participant numbers be displayed in a column to the left of the forest plot.
This is effectively shorthand for {cmd:lcols(}{it:cols_info_defining_participant_numbers}{cmd:)}.

{phang}
{cmd:plotid(}{it:varname}{cmd:|_BYAD} [{cmd:, list nograph}]{cmd:)} is really a {bf:{help forestplot}} option, but has a slightly
extended syntax when supplied to {cmd:ipdmetan}.  {it:varname} may be replaced with {cmd:_BYAD} if the {opt byad} suboption
is supplied to {opt ad()}, since in this case the subgrouping is not defined by an existing variable.

{pmore}
For further details of this option and the {opt list} and {opt nograph} suboptions, see {bf:{help forestplot}}.


{marker saved_results}{...}
{title:Saved results}

{pstd}{cmd:ipdmetan} saves the same results in {cmd:r()} as {bf:{help metan}}, with the following additions:{p_end}

{synoptset 25 tabbed}{...}
{p2col 5 25 29 2: Macros}{p_end}
{synopt:{cmd:r(command)}}Full estimation command-line{p_end}
{synopt:{cmd:r(cmdname)}}Estimation command name{p_end}
{synopt:{cmd:r(estexp)}}Name of pooled coefficient{p_end}

{synoptset 25 tabbed}{...}
{p2col 5 25 29 2: Matrices}{p_end}
{synopt:{cmd:r(coeffs)}}Matrix of study and subgroup identifiers, effect coefficients, numbers of participants, and weights{p_end}

{synoptset 25 tabbed}{...}
{p2col 5 25 29 2: Variables}{p_end}
{synopt:{cmd:_rsample}}Observations included in the analysis (cf. {cmd:e(sample)}){p_end}

{pstd}
N.B. For obvious reasons, {help metan##saved_datasets:new variables} {bf:_ES}, {bf:_seES} etc. are {ul:not} added to the data with {cmd:ipdmetan}.
They are instead returned within the matrix {cmd:r(coeffs)}.


{marker examples}{...}
{title:Examples}

{pstd}
Setup

{cmd}{...}
{* example_start - ipdmetan_setup1}{...}
{phang2}
. use "http://fmwww.bc.edu/repec/bocode/i/ipdmetan_example.dta", clear{p_end}
{phang2}
. stset tcens, fail(fail){p_end}
{* example_end}{...}
{txt}{...}
{pmore}
{it:({stata metan_hlp_run ipdmetan_setup1 using ipdmetan.sthlp, restnot:click to run})}{p_end}


{pstd}
Basic use

{phang2}
{cmd:. {stata "ipdmetan, study(trialid) hr by(region) nograph : stcox trt, strata(sex)"}}{p_end}


{pstd}
Use of {cmd:plotid()}

{cmd}{...}
{* example_start - ipdmetan_ex2}{...}
{phang2}
. ipdmetan, study(trialid) hr by(region) plotid(region){* ///}{p_end}
{p 16 20 2}
forest(favours(Favours treatment # Favours control) box1opts(mcolor(red)) ci1opts(lcolor(red) rcap){* ///}{...}
box2opts(mcolor(blue)) ci2opts(lcolor(blue))){* ///}{p_end}
{p 16 20 2}
: stcox trt, strata(sex){p_end}
{* example_end}{...}
{txt}{...}
{pmore}
{it:({stata metan_hlp_run ipdmetan_ex2 using ipdmetan.sthlp, restpres:click to run})}{p_end}


{pstd}
Treatment-covariate interactions

{cmd}{...}
{* example_start - ipdmetan_ex3}{...}
{phang2}
. ipdmetan, study(trialid) interaction hr keepall{* ///}{p_end}
{p 16 20 2}
forest(boxsca(200) fp(3){* ///}{p_end}
{p 16 20 2}
favours("Favours greater treatment effect" "with higher disease stage"{* ///}{...}
# "Favours greater treatment effect" "with lower disease stage")){* ///}{p_end}
{p 16 20 2}
: stcox trt##c.stage{p_end}
{* example_end}{...}
{txt}{...}
{pmore}
{it:({stata metan_hlp_run ipdmetan_ex3 using ipdmetan.sthlp, restpres:click to run})}{p_end}


{pstd}
Aggregate data setup: create aggregate dataset from IPD dataset (for example purposes only)

{cmd}{...}
{* example_start - ipdmetan_setup2}{...}
{phang2}
. qui ipdmetan, study(trialid) hr nograph saving(region2.dta) : stcox trt if region==2, strata(sex){p_end}
{phang2}
. clonevar _STUDY = trialid{p_end}
{* example_end}{...}
{txt}{...}
{pmore}
{it:({stata metan_hlp_run ipdmetan_setup2 using ipdmetan.sthlp, restpresnot:click to run})}{p_end}


{pstd}
Including aggregate data in the analysis

{cmd}{...}
{* example_start - ipdmetan_ex4}{...}
{phang2}
. ipdmetan, study(_STUDY) hr ad(region2.dta if _USE==1, vars(_ES _seES) npts(_NN) byad) nooverall{* ///}{p_end}
{p 16 20 2}
: stcox trt if region==1, strata(sex){p_end}
{* example_end}{...}
{txt}{...}
{pmore}
{it:({stata metan_hlp_run ipdmetan_ex4 using ipdmetan.sthlp, restpres:click to run})}{p_end}


{pstd}
Use of non e-class commands and {opt lcols()}: Peto log-rank analysis

{cmd}{...}
{* example_start - ipdmetan_ex5}{...}
{phang2}
. ipdmetan (u[1,1]/V[1,1]) (1/sqrt(V[1,1])), study(trialid) by(region) eform effect(Haz. Ratio){* ///}{p_end}
{p 16 20 2}
lcols((u[1,1]) %5.2f "o-E(o)" (V[1,1]) %5.2f "V(o)"){* ///}{p_end}
{p 16 20 2}
forest(nostats nowt favours(Favours treatment # Favours control)){* ///}{p_end}
{p 16 20 2}
: sts test trt, mat(u V){p_end}
{* example_end}{...}
{txt}{...}
{pmore}
{it:({stata metan_hlp_run ipdmetan_ex5 using ipdmetan.sthlp, restpres:click to run})}{p_end}


{pstd}
However, that was just to demonstrate Syntax 1 with a non e-class command.
The example is a Peto logrank survival analysis, which is much more straightforward using Syntax 2 and the {opt logrank} option:

{cmd}{...}
{* example_start - ipdmetan_ex6}{...}
{phang2}
. ipdmetan trt, study(trialid) hr logrank by(region){* ///}{p_end}
{p 16 20 2}
forest(nostats nowt favours(Favours treatment # Favours control)){p_end}
{* example_end}{...}
{txt}{...}
{pmore}
{it:({stata metan_hlp_run ipdmetan_ex6 using ipdmetan.sthlp, restpres:click to run})}{p_end}



{title:Author}

{pstd}
David Fisher, MRC Clinical Trials Unit at UCL, London, UK.{p_end}

{pstd}
Email {browse "mailto:d.fisher@ucl.ac.uk":d.fisher@ucl.ac.uk}{p_end}



{title:Acknowledgments}

{pstd}
Thanks to Phil Jones at UWO, Canada for suggesting improvements in functionality.

{pstd}
The "click to run" element of the examples in this document is handled using an idea originally developed by Robert Picard.



{marker references}{...}
{title:References}

{phang}Fisher DJ. 2015. Two-stage individual participant data meta-analysis and generalised forest plots.
Stata Journal 15: 369-96{p_end}

{phang}Fisher DJ, Carpenter JR, Morris TP, Freeman SC, Tierney JF. 2017.
Meta-analytical methods to identify who benefits most from treatments: daft, deluded, or deft approach?
BMJ 356: j573{p_end}