{smcl}
{* *! version 1.0.2 12Feb2024}{...}
{* *! version 1.0.1 23Jan2024}{...}
{* *! version 1.0.0 28Oct2023}{...}
{title:Title}
{p2colset 5 19 20 2}{...}
{p2col:{hi:funnelinst} {hline 2}} Funnel plot for comparing institutional performance {p_end}
{p2colreset}{...}
{marker syntax}{...}
{title:Syntax}
{pstd}
funnelinst for a proportion:
{p 8 14 2}
{cmd:funnelinst} {opt prop}
{it:{help varname:num1}}
{it:{help varname:den1}}
[{it:{help varname:num2}}
{it:{help varname:den2}}]
{ifin}
[{cmd:,} {it:options}]
{pstd}
funnelinst for a rate:
{p 8 14 2}
{cmd:funnelinst} {opt rate}
{it:{help varname:num1}}
{it:{help varname:den1}}
[{it:{help varname:num2}}
{it:{help varname:den2}}]
{ifin}
[{cmd:,} {it:options}]
{pstd}
funnelinst for an indirect standardized rate:
{p 8 14 2}
{cmd:funnelinst} {opt smr}
{it:{help varname:num1}}
{it:{help varname:den1}}
[{it:{help varname:num2}}
{it:{help varname:den2}}]
{ifin}
[{cmd:,} {it:options}]
{pstd}
funnelinst for a continuous variable:
{p 8 14 2}
{cmd:funnelinst} {opt mean}
{it:{help varname:num1}}
{it:{help varname:den1}}
[{it:{help varname:num2}}
{it:{help varname:den2}}]
{ifin}
[{cmd:,} {it:options}]
{pstd}
In the syntax for {cmd:funnelinst} {opt prop} and {opt rate}, the {it:numerator} is the {ul:number} of events and the {it:denominator} is the population size.{p_end}
{pstd}In the syntax for {cmd:funnelinst} {opt smr}, the {it:numerator} is the observed {ul:number} of events and the {it:denominator} is the predicted {ul:number} of events.{p_end}
{pstd}In the syntax for {cmd:funnelinst} {opt mean}, the {it:numerator} is the {ul:mean} and the {it:denominator} is the {ul:standard error}.{p_end}
{pstd} For all subcommands, when only two variables are specified (numerator1, denominator1), {cmd:funnelinst} analyzes the performance cross-sectionally.
When four variables are specified (numerator1, denominator1, numerator2, denominator2), {cmd:funnelinst} analyzes the change in the performance indicator
between two timepoints (longitudinal). {p_end}
{synoptset 26 tabbed}{...}
{synopthdr}
{synoptline}
{synopt :{opt tar:get(numlist)}}target rate with a maximum of {cmd:two} values; default is to use the computed overall average rate {p_end}
{synopt :{opt cht:ype(string)}}change type for {cmd:prop} and {cmd:rate} subcommands. Available options are "diff" (difference), "ratio" (ratio),
and "or" (odds ratio) {p_end}
{synopt:{opt npo:ints(#)}}number of points to use for constructing the funnel plot denominator; default is {cmd: npoints(200)} {p_end}
{synopt:{opt xr:angehigh(#)}}highest value to use in constructing the funnel plot denominator; default is 1% higher than the largest {it:denominator} value {p_end}
{synopt:{opt log:trans}}log transformation of the performance measure {it:Y} (numerator/denominator); available for subcommands {cmd:prop}, {cmd:rate} and {cmd:smr} {p_end}
{synopt:{opt arc:sintrans}}arc sin transformation of the performance measure {it:Y} (numerator/denominator); available for subcommands {cmd:prop} and {cmd:rate} {p_end}
{synopt:{opt sqr:ttrans}}square root tranformation of the performance measure {it:Y} (numerator/denominator); available for subcommand {cmd:smr} {p_end}
{synopt :{opt pv:al(numlist)}}p-value(s) used as the basis for generating confidence intervals (up to a maximum of {cmd:three}); default is {cmd: pval(0.001 0.025)} {p_end}
{synopt :{opt madj:ust}}adjusts the comparisonwise error rate to account for multiple testing; {opt madj:ust} overrides p-values specified in {cmd:pval} {p_end}
{synopt:{opt ex:act}}perform exact estimation; available for subcommands {cmd:prop}, {cmd:rate} and {cmd:smr} when two variables are specified, and {cmd:smr} when four
variables are specified {p_end}
{synopt:{opt win:sor(#)}}percent of observations to be winsorized; default is no winsorizing {cmd: winsor(0)} {p_end}
{synopt:{opt over:disp(string)}}adjust for overdispersion; options are additive {cmd:overdisp(add)} and multiplicative {cmd:overdisp(mult)} {p_end}
{synopt:{opt yper:cent}}multiplies performance measure by 100; available for subcommands {cmd:prop} and {cmd:smr} {p_end}
{synopt:{opt rate:denom}}multiplies y-axis values by specified amount (e.g. 1000) and divides x-axis values by that specified amount; available for subcommand {cmd:rate} {p_end}
{synopt:{opt ymi:n(#)}}presents observations and CI values in the figure above specified minimum{p_end}
{synopt:{opt yma:x(#)}}presents observations and CI values in the figure below specified maximum{p_end}
{synopt:{opt sav:ing}{cmd:(}{it:{help filename:filename}}{cmd:)}}specify the filename where the CIs, generated by {cmd:funnelinst}, are to be saved.
The suboption "replace" is allowed{p_end}
{synopt:{opt fig:ure}[{cmd:(}{it:{help twoway_options:twoway_options}}{cmd:)]}}allows all {helpb twoway} options to enhance the basic figure {p_end}
{synoptline}
{marker description}{...}
{title:Description}
{pstd}
{opt funnelinst} generates funnel plots for comparing units (e.g. physicians, hospitals, etc.) on a given performance measure (e.g. readmission, mortality, etc.)
either cross-sectionally or longitudinally (i.e. the change between two timepoints), as described in Spiegelhalter (2005a, 2005b, 2012). The funnel plot may
be thought of as a control chart in which units that fall outside of the confidence limits be considered "out of statistical control". A particularly salient
feature of {opt funnelinst} is an option to account for over-dispersion of the performance measure due to unmeasured risk factors, which could lead to a large
number of units being inappropriately classified as out of control. {opt funnelinst} further offers adjustment of the comparisonwise error rate to account for
multiple testing using both the Bonferroni and False Discovery Rate methods (Jones et al. 2008).
{title:Options}
{p 4 8 2}
{cmd:target(}{it:numlist}{cmd:)} target rate for the performance indicator {it:Y} (numerator/denominator). When two values are specified (the maximum allowed),
{opt funnelinst} treats the target values as a range. The default is to use the computed overall average rate of the performance indicator.
{p 4 8 2}
{cmd:chtype(}{it:string}{cmd:)} type of analysis to evaluate change for {cmd:prop} and {cmd:rate} subcommands. Available options are "diff" (difference), "ratio" (ratio),
and "or" (odds ratio). Only one change type is available for subcommands {cmd:smr} and {cmd:mean}, therefore {cmd:chtype(}{it:string}{cmd:)} is not required
for those subcommands. {cmd:chtype(}{it:string}{cmd:)} can only be specified when two numerators and two denominators are specified.
{p 4 8 2}
{cmd:npoints(}{it:#}{cmd:)} specifies the number of points to use for constructing the funnel plot denominator (x-axis); default is {cmd: npoints(200)}.
{p 4 8 2}
{cmd:xrangehigh(}{it:#}{cmd:)} specifies the highest value to use in constructing the funnel plot denominator (x-axis); default is 1% higher than the
largest {it:denominator} value.
{p 4 8 2}
{cmd:logtrans} log transforms the performance measure {it:Y} for computing control limits. However, the funnel plot axes are presented on the original scale.
{cmd:logtrans} is available for subcommands {cmd:prop}, {cmd:rate} and {cmd:smr}.
{p 4 8 2}
{cmd:arcsintrans} arc sin transforms the performance measure {it:Y} for computing control limits. However, the funnel plot axes are presented on the original scale.
{cmd:arcsintrans} is available for subcommands {cmd:prop} and {cmd:rate}.
{p 4 8 2}
{cmd:sqrttrans} square-root transforms the performance measure {it:Y} for computing control limits. However, the funnel plot axes are presented on the original scale.
{cmd:sqrttrans} is available for the subcommand {cmd:smr}.
{p 4 8 2}
{cmd:pval(}{it:#}{cmd:)} p-values to be used as the basis for generating confidence intervals. Up to three p-values may be specified. The default is {cmd: pval(0.001 0.025)}
which are roughly equivalent to 2 and 3 standard deviations.
{p 4 8 2}
{cmd:madjust} uses both the Bonferroni and False Discovery Rate (FDR) methods to adjust the comparisonwise error rate to account for multiple testing. The Bonferroni method
is extremely conservative and becomes even more so as sample size increases (given that the comparisonwise error rate = 0.05 / N). Conversely, the FDR method uses a step-down
approach which is likely to identify more outlier institutions (see Jones et al. 2008). An FDR of 5% means that among all institutions called significant, 5% of these are
truly null on average (Storey and Tibshirani, 2003). {cmd:madjust} will override any {cmd:pval(}{it:#}{cmd:)} that the user may have specified. The graph will present
Bonferroni and FDR adjusted 95% confidence intervals as well as a one-sided unadjusted 95% CI.
{p 4 8 2}
{cmd:exact} performs exact estimation, rather than the default normal approximation. Available for subcommands {cmd:prop}, {cmd:rate} and {cmd:smr} for cross-sectional analyses,
and {cmd:smr} for longitudinal analyses.
{p 4 8 2}
{cmd:winsor(}{it:#}{cmd:)} indicates the percent of the observations to be winsorized in each tail. More specifically, all values above (below) the percent specified are
replaced by the value immediately below (above) the percent specified. {cmd: winsor(#)} must be less than 50. The default is no winsorizing.
{p 4 8 2}
{cmd:overdisp(}{it:string}{cmd:)} adjusts for overdispersion of the performance measure {it:Y}. The two options are additive {cmd:overdisp(add)}
and multiplicative {cmd:overdisp(mult)}.
{p 4 8 2}
{cmd:ypercent} multiplies y-axis values by 100 to present the values on a whole number scale. {cmd:ypercent} is available
for subcommands {cmd:prop} and {cmd:smr}
{p 4 8 2}
{cmd:ratedenom(}{it:#}{cmd:)} multiplies y-axis values by the specified amount (e.g. {cmd:ratedenom(1000)}) and divides x-axis values by that specified amount, indicating
that the values represent a rate. {cmd:ratedenom(#)} is available for subcommand {cmd:rate}.
{p 4 8 2}
{cmd:ymin(}{it:#}{cmd:)} presents observations and CI values only above the specified minimum value. For example, measures that cannot be negative (e.g. proportions and rates)
may be truncated at zero rather than showing lower CI values which may be negative.
{p 4 8 2}
{cmd:ymax(}{it:#}{cmd:)} presents observations and CI values only below the specified maximum value.
{p 4 8 2}
{cmd:figure}[{cmd:(}{it:{help twoway_options:twoway_options}}{cmd:)}] allows the user to modify the default graph settings using all available {helpb twoway} options.
{title:Examples}
{pstd}
{opt 1) funnelinst for a proportion:}{p_end}
{pmore} Set-up {p_end}
{pmore2}{cmd:. use cabg.dta, clear} {p_end}
{pmore} Data are 30-day mortality rates following coronary artery bypass grafts in New York state, 1997–1999, for 33 hospitals. The numerator here is
the unadjusted number of deaths and the denominator is the volume of cases. We specify that the maximum value on the x-axis is 2000, the minimum value
on the y-axis is 0 and the maximum value on the y-axis is 8, and that values on the y-axis be multiplied by 100 to present percentages as whole numbers. {p_end}
{pmore2}{cmd:. funnelinst prop deaths cases, xrangehigh(2000) ymin(0) ymax(8) ypercent}
{pmore} same as above but compute exact statistics instead of normal approximation, and add Y and X titles. {p_end}
{pmore2}{cmd:. funnelinst prop deaths cases, xrangehigh(2000) ymin(0) ymax(8) ypercent exact figure(ytitle(% mortality) xtitle(Volume of cases))}
{pmore} same as above but add marker labels to identify outliers. {p_end}
{pmore2}{cmd:. funnelinst prop deaths cases, xrangehigh(2000) ymin(0) ymax(8) ypercent exact figure(ytitle(% mortality) xtitle(Volume of cases) mlabel( hospital))}
{pmore} Winsorize 10% of high and low observations and adjust for multiplicative overdispersion. {p_end}
{pmore2}{cmd:. funnelinst prop deaths cases, xrangehigh(2000) ymin(0) ymax(8) ypercent winsor(10) over(mult) figure(ytitle(% mortality) xtitle(Volume of cases))}
{pstd}
{opt 2) funnelinst for a rate:}{p_end}
{pmore} Set-up {p_end}
{pmore2}{cmd:. use teenage.dta, clear} {p_end}
{pmore} Data are the under-18 conception rate for English health authorities, 2000–2001. The numerator here is the unadjusted number of teenage pregnancies as a rate per 1000.
The denominator is the population size. We specify that the maximum value on the x-axis is 26000 and the minimum displayed value on the y-axis is 0. We indicate
that the rate should be computed and displayed as "per thousand". {p_end}
{pmore2}{cmd:. funnelinst rate No2001 pop2001, xrangehigh(26000) pval(0.025 0.001) ymin(0) ratedenom(1000) figure(ytitle(Rate per 1000) xtitle(Population ('000s)))}
{pmore} Same as above but adjust for additive overdispersion after winsorizing 10% of the observations.
{pmore2}{cmd:. funnelinst rate No2001 pop2001, xrangehigh(26000) pval(0.025 0.001) ymin(0) ratedenom(1000) wins(10) over(add) figure(ytitle(Rate per 1000) xtitle(Population ('000s)))}
{pmore} Same as above but specifies "madjust" to account for multiple testing rather than use the specified {cmd:pval(#)}.
{pmore2}{cmd:. funnelinst rate No2001 pop2001, xrangehigh(26000) ymin(0) ratedenom(1000) wins(10) over(add) figure(ytitle(Rate per 1000) xtitle(Population ('000s))) madjust}
{pstd}
{opt 3) funnelinst for an indirect standardized rate (SMR):}{p_end}
{pmore} Set-up {p_end}
{pmore2}{bf:{cmd: . use cabg.dta, clear}} {p_end}
{pmore} Data are 30-day adjusted and unadjusted mortality rates following coronary artery bypass grafts in New York state, 1997–1999. Here we use all {cmd:funnelinst}
default settings.
{pmore2}{cmd:. funnelinst smr deaths expected}{p_end}
{pmore} Here we set a target range to be between 0 and 2, and modify the titles on the X and Y axes. {p_end}
{pmore2}{cmd:. funnelinst smr deaths expected, target(0 2) figure(ytitle(Observed number of deaths) xtitle(Expected number of deaths))}
{pmore} Same as above, but we specify that 'exact' statistics be used to construct the CIs {p_end}
{pmore2}{cmd:. funnelinst smr deaths expected, target(0 2) figure(ytitle(Observed number of deaths) xtitle(Expected number of deaths)) exact}
{pstd}
{opt 4) funnelinst for a continuous measure:}{p_end}
{pmore} Set-up {p_end}
{pmore2}{bf:{cmd: . use mean.dta, clear}} {p_end}
{pmore} These are artificial data of a hypothetical performance indicator measured on a continuous scale. We start by using all {cmd:funnelinst}
default settings. {p_end}
{pmore2}{cmd:. funnelinst mean means se}
{pmore} We now adjust for additive overdispersion, winsoring 10% of the observations.{p_end}
{pmore2}{cmd:. funnelinst mean means se, wins(10) over(add)}
{pmore} Same as above but we set the target to range between 4.5 and 5.5.{p_end}
{pmore2}{cmd:. funnelinst mean means se, wins(10) over(add) target(4.5 6.5)}
{pstd}
{opt 5) funnelinst for a change over time:}{p_end}
{pmore} Set-up {p_end}
{pmore2}{cmd:. use teenage.dta, clear} {p_end}
{pmore} We revisit the data for the under-18 conception rate for English health authorities. However, we now compare the
change in conception rates from 1998 to 2001 as a ratio (2001/1998). We use the government's target of a 7.5% reduction
(ratio = 0.925) as the target. The resulting funnel plot is exactly as that presented in Figure 4 of Spiegelhalter (2005a).{p_end}
{pmore2}{cmd:. funnelinst smr O1 E1 O2 E2, xrangehigh(1200) ymin(0.6) ymax(1.6) target(0.925) /// } {p_end}
{cmd: figure(xlabel(0(200)1200) xtitle("Average observed") ytitle("Ratio of Standardised rates (2001/1998)")) exact}
{pmore} Here we compute the ratio of the proportions between 1998 and 2001.{p_end}
{pmore2}{cmd:. funnelinst prop No1998 pop1998 No2001 pop2001 , xrangehigh(30000) figure(xtitle("Average observed") ///} {p_end}
{cmd: ytitle("Ratio of proportions (2001/1998)")) chtype(ratio) ymax(2)}
{pmore} Here we compute the difference in the proportions between 1998 and 2001.{p_end}
{pmore2}{cmd:. funnelinst prop No1998 pop1998 No2001 pop2001 , xrangehigh(30000) figure(xtitle("Average observed") ///} {p_end}
{cmd: ytitle("Difference in proportions (2001 vs 1998)")) chtype(diff) ymax(2)}
{marker results}{...}
{title:Stored results}
{pstd}
{cmd:funnelinst} stores the following in {cmd:r()}:
{synoptset 16 tabbed}{...}
{p2col 5 16 20 2: Scalars}{p_end}
{synopt:{cmd:r(N)}}The number of units{p_end}
{synopt:{cmd:r(target)}}The target (if there is only one){p_end}
{synopt:{cmd:r(target1)}}The lower specified target (if there are two specified){p_end}
{synopt:{cmd:r(target2)}}The higher specified target (if there are two specified){p_end}
{synopt:{cmd:r(winsor)}}The percent of observations winsorized{p_end}
{synopt:{cmd:r(phi)}}The dispersion value{p_end}
{synopt:{cmd:r(tau)}}The random-effects standard deviation (from additive overdispersion){p_end}
{synopt:{cmd:r(pbon)}}The Bonferroni adjusted p-value (if {cmd:madjust} is specified){p_end}
{synopt:{cmd:r(pfdr)}}The FDR adjusted p-value (if {cmd:madjust} is specified){p_end}
{p2colreset}{...}
{title:References}
{p 4 8 2}
Jones, H. E., Ohlssen, D. I. and D. J. Spiegelhalter. 2008. Use of the false discovery rate when comparing multiple health care providers.
{it:Journal of Clinical Epidemiology} 61: 232-240.
{p 4 8 2}
Spiegelhalter, D. J. 2005a. Funnel plots for comparing institutional performance. {it:Statistics in Medicine} 24: 1185-1202.
{p 4 8 2}
Spiegelhalter, D. J. 2005b. Handling over-dispersion of performance indicators. {it: BMJ Quality & Safety} 14: 347-351.
{p 4 8 2}
Spiegelhalter, D. J., Sherlaw-Johnson, C., Bardsley, M., Blunt, I., Wood, C. and O. Grigg. 2012. Statistical methods for healthcare regulation: rating, screening and surveillance.
{it:Journal of the Royal Statistical Society Series A: Statistics in Society} 175: 1-47.
{p 4 8 2}
Storey, J. D. and R. Tibshirani. 2003. Statistical significance for genomewide studies. {it:Proceedings of the National Academy of Sciences} 100: 9440-9445.
{marker citation}{title:Citation of {cmd:funnelinst}}
{p 4 8 2}{cmd:funnelinst} is not an official Stata command. It is a free contribution
to the research community, like a paper. Please cite it as such: {p_end}
{p 4 8 2}
Linden A. & D. J. Spiegelhalter (2024). FUNNELINST: Stata module for generating a funnel plot to compare institutional performance.
{browse "https://ideas.repec.org/c/boc/bocode/s459283.html":https://ideas.repec.org/c/boc/bocode/s459283.html} {p_end}
{title:Authors}
{p 4 4 2}
Ariel Linden{break}
Linden Consulting Group, LLC{break}
alinden@lindenconsulting.org{break}
{p 4 4 2}
David J Spiegelhalter{break}
Statistical Laboratory{break}
Centre for Mathematical Sciences {break}
University of Cambridge{break}
{title:Acknowledgments}
{p 4 4 2}
We thank Daniel Klein for providing the mata code to compute the inverse CDF of the binomial distribution and Poisson distribution for the "exact" option. We thank
Hayley Jones for responding to questions about implementing FDR. We also wish to thank John Moran for advocating that we write this package.
{title:Also see}
{p 4 8 2} Online: {helpb meta funnelplot}, {helpb funnelcompar} if installed {p_end}