{smcl} {* 30nov2022/10dec2022/21mar2023/2apr2023/16jul2023/28aug2023/3sep2023/30nov2023}{...} {hline} help for {hi:upsetplot} {hline} {title:Euler or Venn diagrams mapped to bar charts, upsetplot style} {p 8 12 2} {cmd:upsetplot} {it:varlist} [{it:weight}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}]{break} [ {cmd:,} {opt fillin} {opt percent} {opt frformat(str)} {opt pcformat(str)} {opt select(numlist)} {break} {opt varlabels} {opt sep:arator(str)} {opt gsort(str)} {opt baropts(str)} {opt labelopts(str)} {opt matrixopts(str)} {opt spikeopts(str)} {it:graph_options} {break} {opt savedata(filespec)} ] {p 4 4 2}{cmd:fweight}s and {cmd:aweight}s may be specified. {title:Description} {p 4 4 2} {cmd:upsetplot} produces by default bar chart alternatives to Euler or Venn diagrams showing the frequencies (meaning generally, abundances) of subsets of observations as defined jointly by a bundle of numeric indicator variables. Such plots resemble what have been called UpSetPlots elsewhere. There is scope to recast results to use some other subcommands of {cmd:twoway}. {p 4 4 2} The order of variables presented to {cmd:upsetplot} does not determine the order in which they are shown in a plot. By default bars (spikes, dotted lines) are shown in order of subset frequency, but choosing a more suitable order is the user's prerogative. There is considerable scope to change that sort order using other criteria. {p 4 4 2} Commonly, but not necessarily, subset frequencies (abundances) are already in a variable in the dataset. If so, that variable should be specified as frequency or analytic weights. If no weights are specified, {cmd:upsetplot} counts observations for you. Either way, note that the focus of this command is on displaying frequencies, and not the particular values in each subset. {p 4 4 2} The display uses various subcommands of {helpb graph twoway}. 
The most distinctive feature is a matrix- or table-like legend below the bar chart with marker symbols shown on each row whenever any indicator is 1 in a subset. Correspondingly, absence of a marker symbol implies that that indicator is 0 in that subset. Each row is labelled by a variable name or optionally by a variable label. In either case, variable name or variable label, short text is desirable. Vertical spikes connecting uppermost and lowermost marker symbols are added. {p 4 4 2} The reduced dataset used by {cmd:upsetplot} may be saved for future work using the {cmd:savedata()} option. This dataset may be as useful as or more useful than the plot. Saving results allows greater flexibility in plotting. Tabulation or other reporting is also made easier. Results often need to be scaled in some way, e.g. by looking at conditional proportions or percents. (Optionally, percents can be saved too.) {p 4 4 2} The variables in such a reduced dataset are the original indicator variables and as follows. The names here may thus not be used as names for the indicator variables specified. {p 8 8 2} {cmd:_binary} is a string variable containing a binary code such as {cmd:"00"}, {cmd:"01"}, {cmd:"10"} or {cmd:"11"}. {p 8 8 2} {cmd:_decimal} is a numeric variable containing a decimal equivalent such as {cmd:0}, {cmd:1}, {cmd:2} or {cmd:3}. {p 8 8 2} {cmd:_text} is a string variable containing a description of each subset using variable names or optionally variable labels. The text {cmd:""} is reported for any subset which would otherwise have empty text. {p 8 8 2} {cmd:_freq} is a numeric variable containing the frequency of occurrence of each subset. If analytic weights were specified, values may have fractional parts. Otherwise values will be integer counts. {p 8 8 2} (Optionally) {cmd:_percent} is a numeric variable containing the percent occurrence of each subset. 
{p 8 8 2} {cmd:_degree} is a numeric variable indicating the degree of each subset (number of participating sets), counted as true (1) according to each indicator variable. {p 8 8 2} {cmd:_set} is a string variable that indicates each set using its variable name or optionally its variable label. {p 8 8 2} {cmd:_setfreq} is a numeric variable that indicates the frequency of each set. {p 8 8 2} {cmd:_set} and {cmd:_setfreq} are physically but not logically aligned with the other variables mentioned above. {p 8 8 2} Allenby and Slomson (2011, p.14) comment: "There is, unfortunately, no standard notation for the number of elements in a set". They could have added "and no standard term either". Terms encountered (other than "number of elements") include cardinality, order, potency, power, and the homely size. See annotations of several references. {p 4 4 2} Unwin (2015, pp.179, 180, 182) independently used plots with similar content but different style to show the structure of missing values. For more examples, see Unwin (2024). {p 4 4 2} More detailed Remarks, including many further references, follow later in this help. {title:Options} {p 0 0 2}{it:What to show} {p 4 4 2} {cmd:fillin} insists on showing subsets that do not occur with their frequency zero. This can be helpful if there are only a few such subsets, but not usually otherwise. {p 4 4 2} {cmd:percent} specifies listing and plotting of percents rather than frequencies. {p 4 4 2} {cmd:frformat()} specifies a display format for frequencies in listings. This option may be appropriate if any frequencies include fractional parts. If you wish to specify a format for the bar labels, you should do so directly, using {cmd:labelopts(mlabformat())}. {p 4 4 2} {cmd:pcformat()} specifies a display format for percents in listings. The default is {cmd:%2.1f}. This option has no effect without {cmd:percent}. If you wish to specify a format for the bar labels, you should do so directly, using {cmd:labelopts(mlabformat())}. 
{p 4 4 2} {cmd:select()} specifies that only the first so many bars be shown. That is, without this option the bars that would be shown are considered to be numbered 1, 2, 3 and so on, from left to right. {cmd:select(1/10)} would reduce the display to the first 10 bars. Typically, this option would be used to select the most frequent subsets, but the syntax allows any integer {help numlist}. {p 0 0 2}{it:Detail of display} {p 4 4 2} {cmd:varlabels} specifies use of variable labels to describe each subset. The default is to use variable names. In either case, only variables taking on value 1 in each subset are named or labelled. If a variable label has not been defined, the variable name is used instead. {p 4 4 2} {cmd:separator()} specifies a string to separate variable names or, as above, variable or value labels in display of subsets. The default is {cmd:", "}, a comma followed by a space. Hint: the intersection symbol can be obtained using SMCL's {cmd:"{c -(}&cap{c )-}"} or Unicode character (U+2229) through {cmd:uchar(8745)}, as {cmd:uchar()} expects the decimal value of the code point. The SMCL notation will be interpreted on graphs, but will appear uninterpreted in data listings. The Unicode character should be interpreted in both. {p 4 4 2} {cmd:gsort()} specifies instructions to {helpb gsort} on the order of bars, mentioning one or more variable names for sorting the display. The variable(s) named must be included in the reduced dataset as defined above and could be one or more of the following: {cmd:_binary} {cmd:_decimal} {cmd:_text} {cmd:_freq} {cmd:_percent} (if specified) {cmd:_degree} {p 4 4 2} The default is to sort on the frequency (abundance) variable {cmd:_freq} created by the command, highest values first. {p 4 4 2} {cmd:axisgap()} specifies a gap between the {it:x} axis and the first row of the legend. The default is (the maximum value of {cmd:_freq} or {cmd:_percent}) / 100. After the command has run the value used is accessible as local macro {cmd:axisgap}. 
That allows one or more extra passes to change the gap. {p 4 4 2} {cmd:vargap()} specifies the gap between each row in the legend. The default is (the maximum value of {cmd:_freq} or {cmd:_percent})/25. After the command has run the value used is accessible as local macro {cmd:vargap}. That allows one or more extra passes to change the gap. {p 4 4 2} {cmd:baropts()} are options of {helpb twoway bar} used to tune the rendering of bars. The defaults include {cmd:yla(, ang(h)) barw(0.8) xsc(off)}. {p 4 4 2} {cmd:labelopts()} are options of {helpb twoway scatter} used to tune the rendering of text labels showing frequencies or percents above each bar. The defaults are {cmd:mla(_freq)} or {cmd:mla(_percent)} with {cmd:ms(none) mlabc(black) mlabpos(12) mlabsize(small)}. Alternatively, {cmd:labelopts(none)} suppresses such labels. {p 4 4 2} {cmd:matrixopts()} are options of {helpb twoway scatter} used to tune the matrix- or table-like legend. The defaults include {cmd:ymla(, ang(h) noticks) legend(off) aspect(0.8) ms(O T D S + X)} and marker colours as defined by Okabe and Ito (2008) (on which see e.g. Wong (2011) or Wilke (2019)). {p 8 8 2} Note that something like {cmd:matrixopts(ms(O ..) mc(gs8 ..))} would get you closer to conforming with a widespread if unimaginative convention. {p 4 4 2} {cmd:spikeopts()} are options of {helpb twoway rspike} used to tune the spikes connecting markers in the legend. The default is {cmd:lc(gs8)}. {p 4 4 2} {it:graph_options} are other options of {helpb graph}. An example might be {cmd:name()}. Note that {cmd:graph} may not be especially smart about any space needed above the highest bar label, so you may need two passes and a call to {cmd:yscale()} to extend the axis. {p 0 0 2}{it:Saving results as new dataset} {p 4 4 2} {cmd:savedata()} specifies a (filepath and) filename for saving results to a new dataset. The specification may include {cmd:, replace} {c -} which is needed to replace any existing dataset with the same path and name. 
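{p 0 0 2}{it:A sketch of two-pass use}

{p 4 4 2}
The two-pass idea mentioned under {cmd:axisgap()} and {cmd:vargap()} might be sketched as follows. This is a minimal illustration and not a recommendation: the dataset ({cmd:auto.dta}), the indicator variables {cmd:heavy}, {cmd:thirsty} and {cmd:pricey}, the multipliers 2 and 1.5 and the graph names are all assumptions made up for the example. The sketch relies only on the statement above that the values used are left behind in local macros {cmd:axisgap} and {cmd:vargap}, and so it should be run as one block.

```stata
* illustrative data only: auto.dta and these three indicators are assumptions
sysuse auto, clear
generate byte heavy   = weight > 3000 if !missing(weight)
generate byte thirsty = mpg < 20      if !missing(mpg)
generate byte pricey  = price > 6000  if !missing(price)

* first pass with the default gaps
upsetplot foreign heavy thirsty pricey, name(pass1, replace)

* the help states that the gaps used are now in local macros
display "axisgap used: `axisgap'   vargap used: `vargap'"

* second pass: multipliers are arbitrary, chosen only to show the mechanics
local newaxisgap = 2 * `axisgap'
local newvargap  = 1.5 * `vargap'
upsetplot foreign heavy thirsty pricey, axisgap(`newaxisgap') vargap(`newvargap') name(pass2, replace)
```

{p 4 4 2}
Comparing graphs {cmd:pass1} and {cmd:pass2} side by side is usually the quickest way to judge whether a further pass is needed.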
{title:Examples} {p 4 8 2}{cmd:. local bcolour lcolor(blue) fcolor(blue*0.3)}{p_end} {p 4 8 2}{cmd:. set more off}{p_end} {p 4 8 2}{cmd:. set scheme s1color}{p_end} {p 4 8 2}{cmd:. * EXAMPLE 1}{p_end} {p 4 8 2}{cmd:. * Schnable et al. 2009 counts of gene families}{p_end} {p 4 8 2}{cmd:. clear}{p_end} {p 4 8 2}{cmd:. input Rice Maize Sorghum Arabidopsis freq}{p_end} {p 4 8 2}{cmd:1 0 0 0 1110}{p_end} {p 4 8 2}{cmd:1 1 0 0 229}{p_end} {p 4 8 2}{cmd:0 1 0 0 465}{p_end} {p 4 8 2}{cmd:1 0 1 0 661}{p_end} {p 4 8 2}{cmd:1 1 1 0 2077}{p_end} {p 4 8 2}{cmd:0 1 1 0 405}{p_end} {p 4 8 2}{cmd:0 0 1 0 265}{p_end} {p 4 8 2}{cmd:1 0 1 1 304}{p_end} {p 4 8 2}{cmd:1 1 1 1 8494}{p_end} {p 4 8 2}{cmd:0 1 1 1 112}{p_end} {p 4 8 2}{cmd:0 0 1 1 34}{p_end} {p 4 8 2}{cmd:1 0 0 1 81}{p_end} {p 4 8 2}{cmd:1 1 0 1 96}{p_end} {p 4 8 2}{cmd:0 1 0 1 11}{p_end} {p 4 8 2}{cmd:0 0 0 1 1058}{p_end} {p 4 8 2}{cmd:end}{p_end} {p 4 8 2}{cmd:. label var Arabidopsis "{c -(}it:Arabidopsis{c )-}"}{p_end} {p 4 8 2}{cmd:. local toptitle "t1title(Number of gene families)"}{p_end} {p 4 8 2}{cmd:. upsetplot A R M S [fw=freq], varlabels baropts(`toptitle' `bcolour') name(UP1, replace)}{p_end} {p 4 8 2}{cmd:. upsetplot A R M S [fw=freq], varlabels baropts(`toptitle' `bcolour') gsort(_decimal) name(UP2, replace)}{p_end} {p 4 8 2}{cmd:. upsetplot A R M S [fw=freq], varlabels gsort(_degree -_freq) baropts(`toptitle' `bcolour') name(UP3, replace)}{p_end} {p 4 8 2}{cmd:. upsetplot A R M S [fw=freq], varlabels gsort(-_degree -_freq) baropts(`toptitle' `bcolour') name(UP4, replace)}{p_end} {p 4 8 2}{cmd:. tempfile schnable}{p_end} {p 4 8 2}{cmd:. upsetplot A R M S [fw=freq], varlabels gsort(-_degree -_freq) baropts(`toptitle' `bcolour') savedata("`schnable'")}{p_end} {p 4 8 2}{cmd:. use "`schnable'", clear}{p_end} {p 4 8 2}{cmd:. graph hbar (asis) _setfreq, over(_set, sort(1)) bar(1, `bcolour') blabel(bar) ysc(off) `toptitle' name(UP5, replace)}{p_end} {p 4 8 2}{cmd:. * EXAMPLE 2}{p_end} {p 4 8 2}{cmd:. * D'Hont et al. 
2012}{p_end} {p 4 8 2}{cmd:. clear}{p_end} {p 4 8 2}{cmd:. input byte(Phoenix Musa Brachypodium Sorghum Oryza Arabidopsis) float freq str52 name}{p_end} {p 4 8 2}{cmd:1 1 1 1 1 1 7674 "Phoenix Musa Brachypodium Sorghum Oryza Arabidopsis"}{p_end} {p 4 8 2}{cmd:1 1 1 1 1 0 685 "Phoenix Musa Brachypodium Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:1 1 1 1 0 1 113 "Phoenix Musa Brachypodium Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 1 1 1 0 0 24 "Phoenix Musa Brachypodium Sorghum" }{p_end} {p 4 8 2}{cmd:1 1 1 0 1 1 80 "Phoenix Musa Brachypodium Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 1 1 0 1 0 18 "Phoenix Musa Brachypodium Oryza" }{p_end} {p 4 8 2}{cmd:1 1 1 0 0 1 7 "Phoenix Musa Brachypodium Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 1 1 0 0 0 12 "Phoenix Musa Brachypodium" }{p_end} {p 4 8 2}{cmd:1 1 0 1 1 1 149 "Phoenix Musa Sorghum Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 1 0 1 1 0 62 "Phoenix Musa Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:1 1 0 1 0 1 23 "Phoenix Musa Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 1 0 1 0 0 19 "Phoenix Musa Sorghum" }{p_end} {p 4 8 2}{cmd:1 1 0 0 1 1 28 "Phoenix Musa Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 1 0 0 1 0 35 "Phoenix Musa Oryza" }{p_end} {p 4 8 2}{cmd:1 1 0 0 0 1 206 "Phoenix Musa Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 1 0 0 0 0 467 "Phoenix Musa" }{p_end} {p 4 8 2}{cmd:1 0 1 1 1 1 258 "Phoenix Brachypodium Sorghum Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 0 1 1 1 0 190 "Phoenix Brachypodium Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:1 0 1 1 0 1 11 "Phoenix Brachypodium Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 0 1 1 0 0 23 "Phoenix Brachypodium Sorghum" }{p_end} {p 4 8 2}{cmd:1 0 1 0 1 1 5 "Phoenix Brachypodium Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 0 1 0 1 0 12 "Phoenix Brachypodium Oryza" }{p_end} {p 4 8 2}{cmd:1 0 1 0 0 1 3 "Phoenix Brachypodium Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 0 1 0 0 0 25 "Phoenix Brachypodium" }{p_end} {p 4 8 2}{cmd:1 0 0 1 1 1 21 "Phoenix Sorghum Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 
0 0 1 1 0 42 "Phoenix Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:1 0 0 1 0 1 4 "Phoenix Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 0 0 1 0 0 49 "Phoenix Sorghum" }{p_end} {p 4 8 2}{cmd:1 0 0 0 1 1 6 "Phoenix Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 0 0 0 1 0 32 "Phoenix Oryza" }{p_end} {p 4 8 2}{cmd:1 0 0 0 0 1 105 "Phoenix Arabidopsis" }{p_end} {p 4 8 2}{cmd:1 0 0 0 0 0 769 "Phoenix" }{p_end} {p 4 8 2}{cmd:0 1 1 1 1 1 1458 "Musa Brachypodium Sorghum Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 1 1 1 0 368 "Musa Brachypodium Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:0 1 1 1 0 1 54 "Musa Brachypodium Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 1 1 0 0 13 "Musa Brachypodium Sorghum" }{p_end} {p 4 8 2}{cmd:0 1 1 0 1 1 29 "Musa Brachypodium Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 1 0 1 0 28 "Musa Brachypodium Oryza" }{p_end} {p 4 8 2}{cmd:0 1 1 0 0 1 7 "Musa Brachypodium Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 1 0 0 0 9 "Musa Brachypodium" }{p_end} {p 4 8 2}{cmd:0 1 0 1 1 1 71 "Musa Sorghum Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 0 1 1 0 64 "Musa Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:0 1 0 1 0 1 21 "Musa Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 0 1 0 0 49 "Musa Sorghum" }{p_end} {p 4 8 2}{cmd:0 1 0 0 1 1 13 "Musa Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 0 0 1 0 29 "Musa Oryza" }{p_end} {p 4 8 2}{cmd:0 1 0 0 0 1 155 "Musa Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 1 0 0 0 0 759 "Musa" }{p_end} {p 4 8 2}{cmd:0 0 1 1 1 1 206 "Brachypodium Sorghum Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 1 1 1 0 2809 "Brachypodium Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:0 0 1 1 0 1 14 "Brachypodium Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 1 1 0 0 402 "Brachypodium Sorghum" }{p_end} {p 4 8 2}{cmd:0 0 1 0 1 1 18 "Brachypodium Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 1 0 1 0 547 "Brachypodium Oryza" }{p_end} {p 4 8 2}{cmd:0 0 1 0 0 1 10 "Brachypodium Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 1 0 0 0 387 "Brachypodium" }{p_end} {p 4 8 2}{cmd:0 0 0 1 1 1 40 
"Sorghum Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 0 1 1 0 1151 "Sorghum Oryza" }{p_end} {p 4 8 2}{cmd:0 0 0 1 0 1 9 "Sorghum Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 0 1 0 0 827 "Sorghum" }{p_end} {p 4 8 2}{cmd:0 0 0 0 1 1 6 "Oryza Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 0 0 1 0 1246 "Oryza" }{p_end} {p 4 8 2}{cmd:0 0 0 0 0 1 1187 "Arabidopsis" }{p_end} {p 4 8 2}{cmd:0 0 0 0 0 0 . "" }{p_end} {p 4 8 2}{cmd:end}{p_end} {p 4 8 2}{cmd:. local toptitle "t1title(Number of gene families)"}{p_end} {p 4 8 2}{cmd:. upsetplot P-A [w=freq], baropts(`toptitle' `bcolour' ysc(r(. 8500))) labelopts(mlabang(v) mlabpos(1) mlabsize(vsmall)) name(UP6, replace)}{p_end} {p 4 8 2}{cmd:. * EXAMPLE 3}{p_end} {p 4 8 2}{cmd:. * incidence of missing values in nlswork.dta}{p_end} {p 4 8 2}{cmd:. webuse nlswork, clear}{p_end} {p 4 8 2}{cmd:. * missings from Stata Journal: search dm0085, entry}{p_end} {p 4 8 2}{cmd:. capture noisily missings report}{p_end} {p 4 8 2}{cmd:. foreach v in ind_code union wks_ue tenure wks_work {c -(}}{p_end} {p 4 8 2}{cmd:. {space 4}gen M`v' = missing(`v')}{p_end} {p 4 8 2}{cmd:. {space 4}label var M`v' "`v'"}{p_end} {p 4 8 2}{cmd:. {c )-}}{p_end} {p 4 8 2}{cmd:. local toptitle "t1title(Number of missing values)"}{p_end} {p 4 8 2}{cmd:. upsetplot M*, varlabels baropts(`toptitle' `bcolour') name(UP7, replace)}{p_end} {p 4 8 2}{cmd:. upsetplot M* if missing(ind_code, union, wks_ue, tenure, wks_work), varlabels baropts(`toptitle' `bcolour') gsort(_degree -_freq) name(UP8, replace)}{p_end} {p 4 8 2}{cmd:. * EXAMPLE 4}{p_end} {p 4 8 2}{cmd:. * various indicators in nlswork.dta}{p_end} {p 4 8 2}{cmd:. webuse nlswork, clear}{p_end} {p 4 8 2}{cmd:. local toptitle "t1title(Number of people)"}{p_end} {p 4 8 2}{cmd:. upsetplot nev_mar c_city collgrad south, baropts(`toptitle' `bcolour') name(UP9, replace)}{p_end} {p 4 8 2}{cmd:. label var nev_mar "never married"}{p_end} {p 4 8 2}{cmd:. label var c_city "central city"}{p_end} {p 4 8 2}{cmd:. 
label var collgrad "college graduate"}{p_end} {p 4 8 2}{cmd:. label var south "South"}{p_end} {p 4 8 2}{cmd:. upsetplot nev_mar c_city collgrad south, varlabels baropts(`toptitle' `bcolour') name(UP10, replace)}{p_end} {p 4 8 2}{cmd:. sortmean nev_mar c_city collgrad south, desc}{p_end} {p 4 8 2}{cmd:. upsetplot `sortlist', varlabels baropts(`toptitle' `bcolour') name(UP11, replace)}{p_end} {title:Authors} {p 4 4 2}Tim P. Morris, MRC Clinical Trials Unit, University College London{break} tim.morris@ucl.ac.uk {p 4 4 2}Nicholas J. Cox, Durham University{break} n.j.cox@durham.ac.uk {title:Acknowledgments} {p 4 4 2} This paper arose from a conversation between the authors at the London Stata meeting in 2022. We much appreciate the enterprise, energy, and enthusiasm behind these meetings over almost 30 years shown by the late Ana Timberlake, Teresa Timberlake, David Corbett, and their colleagues. {p 4 4 2} Angela Wood originally drew the attention of TPM to upsetplots. Antony Unwin provided NJC access to his forthcoming book. Several posts on Statalist indicated very helpfully both interest in this problem and reactions to earlier versions of these commands. {title:Also see} {p 4 4 2}Help for{break} {help misstable}{break} {help sortmean} (if installed){break} {help groups} ({it:Stata Journal}) (if installed){break} {help vennbar} (if installed){break} {help jaccard} (if installed){break} {help findname} ({it:Stata Journal}) (if installed){break} {help vorter} (if installed) {title:Remarks} {it:Explanation and advice} {p 4 4 2} {cmd:upsetplot} requires a bundle of numeric variables with values 0 or 1. Such variables are variously called indicator, dummy, binary, dichotomous, zero-one, one-hot, Boolean, logical or quantal. Missing values will be ignored. Otherwise presenting values other than 0 or 1 is considered an error. Observations used will thus have all values 0 or 1 in all variables specified. 
Differently put, {cmd:upsetplot} is not for string variables or categorical variables that have 3 or more distinct values. {p 4 4 2} If your dataset is already aggregated to frequencies or other measures of abundance, specify those as weights multiplying the indicator variables. {p 4 4 2} Consider two such indicators. The concatenations 00, 01, 10 and 11 define the 4 possible subsets defined by those variables, and also distinct binary codes for binary numbers 00 to 11, and also distinct decimal equivalents 0 to 3. Otherwise put, concatenation is here a simple and natural way to define composite categorical variables (Cox 2007). 00 is of degree 0, 01 and 10 are of degree 1, and 11 is of degree 2. Here, and indeed generally, leading zeros are retained as helpful reminders even though they might be considered redundant or ornamental. {p 4 4 2} Similarly, three such variables have eight possible binary concatenations 000, 001, 010, 011, 100, 101, 110, 111 and decimal equivalents 0 to 7. More generally, {it:k} such variables define 2^{it:k} possible subsets. {it:Pre-processing} {p 4 4 2} The order of variables presented determines the order in which they are shown in the legend. Choosing a suitable order is considered to be the user's responsibility. For example, if the data were medical symptoms exhibited by patients, a substantive grouping (say, cardiovascular symptoms all together) could make analytical sense. Otherwise, ordering variables by their means (equivalently, the frequency or abundance of states coded as 1) may be helpful. A utility {help sortmean} is distributed as ancillary with this package and an example above shows its use. See also {help vorter} from SSC. {p 4 4 2} Variables that are identically 0 or identically 1, at least in the data being shown, are not always useful and so might be omitted. 
{help findname} (Cox 2010, and {cmd:search findname, sj} for updates) can be used to find such variables through options {cmd:all(@ == 0)} or {cmd:all(@ == 1)}. Such calls can be extended to check for missing values, which this command will ignore anyway. {p 4 4 2} Various {help egen} functions can be useful in selecting observations of particular interest. Thus {cmd:rowtotal()} yielding totals of 2 or more would identify occurrence of two or more conditions simultaneously. {it:Historical remarks and literature survey} {p 4 4 2} The elementary but fundamental idea of representing true (or present) as 1 and false (or absent) as 0 has a splendid history. Although it has yet longer roots, the idea was strongly developed by George Boole (1815{c -}1864): Boole (1854) was his major work in this territory, on which see particularly Grattan-Guinness (2005). Boole has been given a full-length biography (MacHale 2014) and an even longer sequel (MacHale and Cohen 2018). For shorter accounts see Gardner (1969), Broadbent (1970), MacHale (2000, 2008), Heath and Seneta (2001), or Grattan-Guinness (2004). See (e.g.) Dewdney (1993) or Gregg (1998) for samples of how such Boolean algebra features in computing. See Knuth (1998, Ch.4.1) for an excellent historical summary of positional number systems and Knuth (2011) for a masterly synopsis, including historical material, of related combinatorial algorithms. See Strickland and Lewis (2022) for a focus on binary arithmetic and logic in the work of Leibniz (1646{c -}1716). The leading biography of Leibniz is by Antognazza (2009), although the earlier biography by Aiton (1985) is still very helpful. Leibniz's projects feature in many sub-plots in Stephenson (2003) and its sequels. Cox (2016) makes further Stata-related comments on truth, falsity, and indication. Cox and Schechter (2019) survey the creation of indicator variables in Stata. 
{p 4 4 2} Various commentators, from Leibniz onward, have seen anticipations of binary arithmetic in the divination manual {it:I Ching} ({it:Yijing}, {it:Yi Jing}, {it:Yi King}, etc.). That seems exaggerated. See Gardner (1974) for a brisk discussion and Knuth (2011) and Strickland and Lewis (2022) for further comments. {p 4 4 2} For 2 or 3 variables Euler or Venn diagrams annotated with subset frequencies (or other information) are relatively easy to draw and sometimes to understand, but even for 4 or 5 variables they are harder to draw and even harder to understand. For say {it:k} = 5, 2^5 = 32, which poses a challenge to show data intelligibly. For say {it:k} = 10, 2^10 = 1024, which is often far too many subsets to work with simultaneously. However, the problem will be eased in practice if many of the possible subsets do not occur, or occur so rarely that they can be ignored. For modest values of {it:k}, bar charts may be a competitive alternative, which is the idea implemented here. {p 4 4 2} This {cmd:upsetplot} command was stimulated by what have been called UpSetPlots. (Presentations and implementations show some variety in use of upper or lower case and over whether the name is given with an extra space.) See Lex (2021), Lex and Gehlenborg (2014), Lex {it:et al.} (2014), Conway {it:et al.} (2017), and Ballarini {it:et al.} (2020). Lex (2022) explains the origin of the name as a play on "set" and because he was "upset" by Venn diagram alternatives in the literature (e.g. D'Hont {it:et al.} 2012). The idea that Venn diagrams are better replaced by bar chart alternatives is older (e.g. Kosara 2007) and indeed implicit in any decision to use bar charts when researchers are aware of Venn diagrams. The assertion "I would argue that Venn diagrams are a great tool for learning about sets, but useless as a visualization" (Kosara 2007) is unfortunately supported by many examples in various literatures. 
{p 4 4 2} Hamming (1991, pp.16{c -}17) commended Venn diagrams for simple cases yet continued: "But if you try to go to very many subsets then the problem of drawing a diagram which will show clearly what you are doing is often difficult. Circles are not, of course, necessary but when you are forced to draw very snake-like regions then the diagram is of little help in visualizing the situation." {p 4 4 2} Gleason (1991, p.33) commented that Venn diagrams become unwieldy for a number of sets "exceeding 4 or 5". {p 4 4 2} Venn diagrams are widely familiar in mathematics and science and indeed as a cultural meme echoed in cartoons, T-shirt or mug designs, and much else. Christianson (2012) mentioned Venn diagrams as one of {it:100 diagrams that changed the world}. Friendly introductions to set theory featuring Venn diagrams include Stewart (1975) and Gullberg (1997). Conversely, compare Hamming (1985, p.367): "Set theory has been taught until the typical student is weary of it, so we will assume that it is familiar." Beyond their original and continuing use in logic, Venn diagrams are commonly used in introductions to probability: see (e.g.) Pitman (1993), Whittle (2000), Dekking {it:et al.} (2005), Miller (2017), or Blitzstein and Hwang (2019). Historically and to the present set theory is linked to much fundamental work in logic, number theory, and other parts of mathematics (Bagaria 2008; various chapters in Grattan-Guinness 1994; Stillwell 2010). {p 4 4 2} For the history of Venn and related diagrams, see Baron (1969), Gardner (1982), Edwards (2004), Moktefi and Shin (2012), and Bennett (2015). Friendly and Wainer (2021, pp.102{c -}103) flag the use of an area-proportional Venn-like diagram by Playfair (1801, opp.p.48). Wilkinson (2012) covers some more recent work on drawing area-proportional plots from a statistical point of view. Macfarlane (1885, 1890) referred to composite categories laid out in sequence as the logical spectrum. 
{p 4 4 2} Venn (1880a, 1880b, 1880c, 1881, 1894) made explicit that the diagrams later named after him grew out of earlier work. Indeed, few logicians were as fully aware of previous contributions. Thus the name exemplifies Stigler's Law (1980, 1999) that "No scientific discovery is named after its original discoverer". The injustice is partially corrected by crediting Euler's earlier work (1768), on which see conveniently Sandifer (2007) or Bennett (2015). A distinction is often drawn (e.g. Mollerup 2015, p.166) that Venn diagrams show all possible combinations, while Euler diagrams only show actual combinations. However, Euler's contribution in turn was preceded by yet earlier work by Leibniz and several other scholars. Nevertheless, crediting Euler or Venn is fair and there is no point to suggesting yet another term when both terms are so well established. {p 4 4 2} John Venn (1834{c -}1923) now benefits from a full-length biography (Verburgt 2022). For shorter appreciations, see Broadbent (1976), Grattan-Guinness (2001), or Gibbins (2004). Grattan-Guinness (2011) places the work of Boole and Venn in context, surveying the development of logic in 19th century Britain. Venn's interest in probability and statistics was profound: see especially his first book {it:The Logic of Chance} (1866, 1876, 1888) and a still useful review paper on averages (Venn 1891). {p 4 4 2} Leonhard Euler (1707{c -}1783) is also well served by a full-length biography (Calinger 2016). See also Calinger, Denisova and Polyakhova (2019) on what in English is known as {it:Letters to a German Princess}. For a concise overview of some of his mathematical achievements, see Dunham (1999). For a shorter although still detailed account, see Youschkevitch (1971). For a very concise account, see Sandifer (2008). {p 4 4 2} For implementations of Venn diagrams in Stata, see (e.g.) Lauritsen (1999a, 1999b, 1999c, 2000, 2009), Gong and Ostermann (2011), and Over (2022). 
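{it:A sketch of the subset coding}

{p 4 4 2}
The binary codes, decimal equivalents and degrees discussed under {it:Explanation and advice} can be reproduced by hand for three indicators. What follows is a minimal sketch and not the code used inside {cmd:upsetplot}: the variable names {cmd:a}, {cmd:b}, {cmd:c}, {cmd:binary}, {cmd:decimal} and {cmd:degree} are hypothetical, and treating the first variable as the most significant binary digit is an assumption made for illustration.

```stata
* hypothetical indicators a, b, c covering all 2^3 = 8 possible subsets
clear
input byte(a b c)
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
end

* binary code by concatenation, so that leading zeros are retained
egen binary = concat(a b c)

* decimal equivalent, assuming a is the most significant digit
generate byte decimal = 4 * a + 2 * b + c

* degree is the number of indicators equal to 1
egen byte degree = rowtotal(a b c)

list, noobs
```

{p 4 4 2}
With {it:k} indicators the same recipe extends by adding further powers of 2, which is one way to see why there are 2^{it:k} possible subsets.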
{title:References} {p 4 8 2}Aiton, E.J. 1985. {it:Leibniz: A Biography.} Bristol: Adam Hilger. {p 4 8 2} Allenby, R.B.J.T. and A. Slomson. 2011. {it:How to Count: An Introduction to Combinatorics.} Boca Raton, FL: CRC Press. {p 4 8 2} Antognazza, M.R. 2009. {it:Leibniz: An Intellectual Biography.} New York: Cambridge University Press. {p 4 8 2} Bagaria, J. 2008. Set theory. In Gowers, T. (ed.) {it:The Princeton Companion to Mathematics.} Princeton, NJ: Princeton University Press, 615{c -}634. {p 4 8 2} Ballarini, N.M., Y-D. Chiu, F. K{c o:}nig, M. Posch, and T. Jaki. 2020. A critical review of graphics for subgroup analyses in clinical trials. {it:Pharmaceutical Statistics} 19: 541{c -}560. {browse "https://doi.org/10.1002/pst.2012":https://doi.org/10.1002/pst.2012} {p 4 8 2} Baron, M.E. 1969. A note on the historical development of logic diagrams: Leibniz, Euler and Venn. {it:The Mathematical Gazette} 53: 113{c -}125. doi:10.2307/3614533 {p 4 8 2} Bennett, D. 2015. Origins of the Venn diagram. In Zack, M. and E. Landry (eds) {it:Research in History and Philosophy of Mathematics.} New York: Springer, 105{c -}120. {browse "https://logic-teaching.github.io/pred/texts/Bennett%202015%20-%20Origins%20of%20the%20Venn%20Diagram.pdf":https://logic-teaching.github.io/pred/texts/Bennett%202015%20-%20Origins%20of%20the%20Venn%20Diagram.pdf} {p 4 8 2} Biggs, N.L. 2002. {it:Discrete Mathematics.} Oxford: Oxford University Press. p.48 size, cardinality {p 4 8 2} Blitzstein, J.K. and J. Hwang. 2019. {it:Introduction to Probability.} Boca Raton, FL: CRC Press. {p 4 8 2} Boole, G. 1854. {it:An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities.} London: Walton and Maberley; Cambridge: Macmillan. {p 4 8 2} Broadbent, T.A.A. 1970. Boole, George. In Gillispie, C.C. (ed.) {it:Dictionary of Scientific Biography.} New York: Charles Scribner's Sons 2: 293{c -}298. {p 4 8 2} Broadbent, T.A.A. 1976. Venn, John. In Gillispie, C.C. 
(ed.) {it:Dictionary of Scientific Biography.} New York: Charles Scribner's Sons 13: 611{c -}613. {p 4 8 2} Calinger, R.S. 2016. {it:Leonhard Euler: Mathematical Genius in the Enlightenment.} Princeton, NJ: Princeton University Press. {p 4 8 2} Calinger, R.S., E. Denisova and E.N. Polyakhova. 2019. {it:Leonhard Euler's Letters to a German Princess: A Milestone in the History of Physics Textbooks and More.} San Rafael, CA: Morgan & Claypool. {p 4 8 2} Cameron, P.J. 1994. {it:Combinatorics: Topics, Techniques, Algorithms.} Cambridge: Cambridge University Press. p.16 cardinality {p 4 8 2} Christianson, S. 2012. {it:100 Diagrams That Changed the World: From the Earliest Cave Paintings to the Innovation of the iPod.} New York: Penguin. {p 4 8 2} Conway, J.R., A. Lex and N. Gehlenborg. 2017. UpSetR: An R package for the visualization of intersecting sets and their properties. {it:Bioinformatics} 33: 2938{c -}2940. doi:10.1093/bioinformatics/btx364. {p 4 8 2} Cox, N.J. 2007. Stata tip 52: Generating composite categorical variables. {it:Stata Journal} 7: 582{c -}583. {p 4 8 2} Cox, N.J. 2010. Speaking Stata: Finding variables. {it:Stata Journal} 10: 281{c -}296. {p 4 8 2} Cox, N.J. 2016. Truth, falsity, indication, and negation. {it:Stata Journal} 16: 229{c -}236. {p 4 8 2} Cox, N.J. 2017. Tables as lists: The groups command. {it:Stata Journal} 17: 760{c -}773. {p 4 8 2} Cox, N.J. 2018. Software update: Tables as lists: The groups command. {it:Stata Journal} 18: 291. {p 4 8 2} Cox, N.J. and C.B. Schechter. 2019. How best to generate indicator or dummy variables. {it:Stata Journal} 19: 246{c -}259. {p 4 8 2} Crossley, J.N., J. Ash, C.J. Brickhill, J.C. Stillwell and N.H. Williams. 1972. {it:What is Mathematical Logic?} London: Oxford University Press. p.69 cardinality {p 4 8 2} Dekking, F.M., C. Kraikamp, H.P. Lopuha{c a:} and L.E. Meester. 2005. {it:A Modern Introduction to Probability and Statistics.} London: Springer. {p 4 8 2} Dewdney, A.K. 1993. 
{it:The New Turing Omnibus: 66 Excursions in Computer Science.} New York: Henry Holt. {p 4 8 2} D'Hont, A. and many authors. 2012. The banana ({it:Musa acuminata}) genome and the evolution of monocotyledonous plants. {it:Nature} 488: 213{c -}217. https://doi.org/10.1038/nature11241 {p 4 8 2} Dunham, W. 1999. {it:Euler: The Master of Us All.} Washington, DC: Mathematical Association of America. {p 4 8 2} Edwards, A.W.F. 2004. {it:Cogwheels of the Mind: The Story of Venn Diagrams.} Baltimore: Johns Hopkins University Press. {p 4 8 2} Euler, L. 1768. {it:Lettres {c a'g} Une Princesse d'Allemagne sur Divers Sujets de Physique & de Philosophie.} Tome second. Saint Petersbourg: L'Acad{c e'}mie Imp{c e'}riale des Sciences. {p 4 8 2} Freeman, S.C., E. Saeedi, J.M. Ordóñez-Mena, C.R. Nevill, J. Hartmann-Boyce, D.M. Caldwell, N.J. Welton, N.J. Cooper and A.J. Sutton. 2023. Data visualisation approaches for component network meta-analysis: visualising the data structure. {it:BMC Medical Research Methodology} 23, 208. https://doi.org/10.1186/s12874-023-02026-z {p 4 8 2} Friendly, M. and H. Wainer. 2021. {it:A History of Data Visualization and Graphic Communication.} Cambridge, MA: Harvard University Press. {p 4 8 2} Gardner, M. 1969. Boolean algebra, Venn diagrams and the propositional calculus. {it:Scientific American} 220(2): 110{c -}114. {browse "https://www.jstor.org/stable/pdf/24926287.pdf":https://www.jstor.org/stable/pdf/24926287.pdf} Reprinted as Boolean algebra in 1979. {it:Mathematical Circus.} New York: Alfred A. Knopf, 87{c -}101. {p 4 8 2} Gardner, M. 1974. The combinatorial basis of the "I Ching," the Chinese book of divination and wisdom. {it:Scientific American} 230(1): 108{c -}113. {browse "https://www.jstor.org/stable/pdf/24949988.pdf":https://www.jstor.org/stable/pdf/24949988.pdf} Reprinted as The {it:I Ching} in 1986. {it:Knotted Doughnuts and other Mathematical Entertainments.} New York: W.H. Freeman, 243{c -}256. {p 4 8 2} Gardner, M. 1982. 
{it:Logic Machines and Diagrams.} Chicago: University of Chicago Press. {p 4 8 2} Gibbins, J.R. 2004. Venn, John. In Matthew, H.C.G. and B. Harrison (eds) {it:Oxford Dictionary of National Biography.} Oxford: Oxford University Press 56: 259{c -}260. {p 4 8 2} Gleason, A.M. 1991. {it:Fundamentals of Abstract Analysis.} Boston, MA: Jones and Bartlett. [original publication: 1966. Reading, MA: Addison-Wesley.] {p 4 8 2} Gong, W. and J. Ostermann. 2011. pvenn: module to create proportional Venn diagram. http://fmwww.bc.edu/RePEc/bocode/p [accessed 20 September 2022] {p 4 8 2} Graham, R.L., D.E. Knuth and O. Patashnik. 1994. {it:Concrete Mathematics: A Foundation for Computer Science.} Reading, MA: Addison-Wesley. p.xi cardinality {p 4 8 2} Grattan-Guinness, I. (ed.) 1994. {it:Companion Encyclopedia of the History and Philosophy of the Mathematical Sciences.} London: Routledge. {p 4 8 2} Grattan-Guinness, I. 2001. John Venn. In Heyde, C.C. and E. Seneta (eds) {it:Statisticians of the Centuries.} New York: Springer, 194{c -}196. {p 4 8 2} Grattan-Guinness, I. 2004. Boole, George. In Matthew, H.C.G. and B. Harrison (eds) {it:Oxford Dictionary of National Biography.} Oxford: Oxford University Press 6: 582{c -}585. {p 4 8 2} Grattan-Guinness, I. 2005. George Boole, {it:An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities} (1854). In Grattan-Guinness, I. (ed.) {it:Landmark Writings in Western Mathematics 1640{c -}1940.} Amsterdam: Elsevier, 470{c -}479. {p 4 8 2} Grattan-Guinness, I. 2011. Victorian logic: From Whately to Russell. In Flood, R., A. Rice and R. Wilson (eds) {it:Mathematics in Victorian Britain.} Oxford: Oxford University Press, 359{c -}374. {p 4 8 2} Green, J.A. 1965. {it:Sets and Groups.} London: Routledge and Kegan Paul. p.2 order [of a finite set] {p 4 8 2} Green, J.A. 1988. {it:Sets and Groups: A First Course in Algebra.} London: Chapman & Hall. p.2 order [of a finite set] {p 4 8 2} Gregg, J.R. 
1998. {it:Ones and Zeros: Understanding Boolean Algebra, Digital Circuits, and the Logic of Sets.} Piscataway, NJ: IEEE Press. {p 4 8 2} Grimmett, G.R. and D.R. Stirzaker. 2020. {it:Probability and Random Processes.} Oxford: Oxford University Press. p.6 cardinality {p 4 8 2} Gullberg, J. 1997. {it:Mathematics: From the Birth of Numbers.} New York: W.W. Norton. {p 4 8 2} Hamming, R.W. 1985. {it:Methods of Mathematics Applied to Calculus, Probability, and Statistics.} Englewood Cliffs, NJ: Prentice-Hall. {p 4 8 2} Hamming, R.W. 1991. {it:The Art of Probability for Scientists and Engineers.} Reading, MA: Addison-Wesley. {p 4 8 2} Heath, P. and E. Seneta. 2001. George Boole. In Heyde, C.C. and E. Seneta (eds) {it:Statisticians of the Centuries.} New York: Springer, 167{c -}170. {p 4 8 2} Ito, K. (ed.) 1987. {it:Encyclopedic Dictionary of Mathematics.} Cambridge, MA: MIT Press. [macron accent on o of Ito] potency, power {p 4 8 2} James, G. and R.C. James. 1992. {it:Mathematics Dictionary.} New York: Van Nostrand Reinhold. potency, power {p 4 8 2} Knuth, D.E. 1997. {it:The Art of Computer Programming: Volume 1: Fundamental Algorithms.} Reading, MA: Addison-Wesley. p.625 cardinality {p 4 8 2} Knuth, D.E. 1998. {it:The Art of Computer Programming: Volume 2: Seminumerical Algorithms.} Reading, MA: Addison-Wesley. {p 4 8 2} Knuth, D.E. 2011. {it:The Art of Computer Programming: Volume 4A: Combinatorial Algorithms, Part 1.} Upper Saddle River, NJ: Addison-Wesley. {p 4 8 2} Kosara, R. 2007. Autism diagnosis accuracy {c -} Visualization redesign. {browse "https://eagereyes.org/criticism/autism-diagnosis-accuracy":https://eagereyes.org/criticism/autism-diagnosis-accuracy} [accessed 20 September 2022] {p 4 8 2} Lauritsen, J.M. 1999a. Drawing Venn diagrams. {it:Stata Technical Bulletin} 47: 3{c -}8. {p 4 8 2} Lauritsen, J.M. 1999b. Drawing Venn diagrams. {it:Stata Technical Bulletin} 48: 3. {p 4 8 2} Lauritsen, J.M. 1999c. Drawing Venn diagrams. 
{it:Stata Technical Bulletin} 49: 8. {p 4 8 2} Lauritsen, J.M. 2000. An update to drawing Venn diagrams. {it:Stata Technical Bulletin} 54: 17{c -}19. {p 4 8 2} Lauritsen, J.M. 2009. venndiag: module to generate Venn diagrams. http://fmwww.bc.edu/RePEc/bocode/v [accessed 20 September 2022] {p 4 8 2} Lex, A. 2021. UpSet: Visualizing intersecting sets. {browse "https://upset.app/":https://upset.app/} [accessed 20 September 2022] {p 4 8 2} Lex, A. 13 September 2022. {browse "https://mobile.twitter.com/alexander_lex/status/1569741352417787905": https://mobile.twitter.com/alexander_lex/status/1569741352417787905} [accessed 20 September 2022] {p 4 8 2} Lex, A. and N. Gehlenborg. 2014. Sets and intersections. {it:Nature Methods} 11: 779. doi:10.1038/nmeth.3033 {p 4 8 2} Lex, A., N. Gehlenborg, H. Strobelt, R. Vuillemot and H. Pfister. 2014. UpSet: Visualization of intersecting sets. {it:IEEE Transactions on Visualization and Computer Graphics} 20: 1983{c -}1992. doi:10.1109/TVCG.2014.2346248 {p 4 8 2} Liebeck, M. 2016. {it:A Concise Introduction to Pure Mathematics.} Boca Raton, FL: CRC Press. p.195 cardinality {p 4 8 2} Macfarlane, A. 1885. The logical spectrum. {it:The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science} Series 5, 19 (119): 286{c -}290. doi:10.1080/14786448008626877. {p 4 8 2} Macfarlane, A. 1890. Adaption of the method of the logical spectrum to Boole's problem. {it:Proceedings of the American Association for the Advancement of Science} 39: 57{c -}60. {p 4 8 2} MacHale, D. 2000. George Boole 1815{c -}1864. In Houston, K. (ed.) {it:Creators of Mathematics: The Irish Connection.} Dublin: University College Dublin Press, 27{c -}32. {p 4 8 2} MacHale, D. 2008. George Boole. In Gowers, T. (ed.) {it:The Princeton Companion to Mathematics.} Princeton, NJ: Princeton University Press, 769{c -}770. {p 4 8 2} MacHale, D. 2014. {it:The Life and Work of George Boole: A Prelude to the Digital Age.} Cork: Cork University Press. 
{p 4 8 2} MacHale, D. and Y. Cohen. 2018. {it:New Light on George Boole.} Cork: Cork University Press. {p 4 8 2} Miller, S.J. 2017. {it:The Probability Lifesaver: All the Tools You Need to Understand Chance.} Princeton, NJ: Princeton University Press. {p 4 8 2} Moktefi, A. and S.-J. Shin. 2012. A history of logic diagrams. In Gabbay, D.M., F.J. Pelletier and J. Woods (eds) {it:Handbook of the History of Logic. Volume 11: Logic: A History of Its Central Concepts.} Amsterdam: North-Holland, 611{c -}682. {p 4 8 2} Mollerup, P. 2015. {it:Data Design: Visualizing Quantities, Locations, Connections.} London: Bloomsbury. {p 4 8 2} Okabe, M. and K. Ito. 2008. Color Universal Design (CUD): How to make figures and presentations that are friendly to colorblind people. {browse "http://jfly.uni-koeln.de/color/":http://jfly.uni-koeln.de/color/} {p 4 8 2} Over, M. 2022. pvenn2: Proportional Venn diagram, enhanced version of pvenn. {browse "http://digital.cgdev.org/doc/stata/MO/Misc":http://digital.cgdev.org/doc/stata/MO/Misc} [accessed 20 September 2022] {p 4 8 2} Pitman, J. 1993. {it:Probability.} New York: Springer. {p 4 8 2} Playfair, W. 1801. {it:The Statistical Breviary.} London: Wallis etc. {p 4 8 2} Sandifer, C.E. 2007. {it:How Euler Did It.} Washington, DC: Mathematical Association of America. {p 4 8 2} Sandifer, C.E. 2008. Leonhard Euler. In Gowers, T. (ed.) {it:The Princeton Companion to Mathematics.} Princeton, NJ: Princeton University Press, 747{c -}749. {p 4 8 2} Schnable, P.S. and many co-authors. 2009. The B73 maize genome: complexity, diversity, and dynamics. {it:Science} 326: 1112{c -}1115. http://www.jstor.org/stable/27736489 {p 4 8 2} Stephenson, N. 2003. {it:Quicksilver: Volume One of the Baroque Cycle.} New York: William Morrow. {p 4 8 2} Stewart, I. 1975. {it:Concepts of Modern Mathematics.} Harmondsworth: Penguin. {p 4 8 2} Stigler, S.M. 1980. Stigler's law of eponymy. {it:Transactions of the New York Academy of Sciences} 39: 147{c -}158. 
doi:10.1111/j.2164-0947.1980.tb02775.x {p 4 8 2} Stigler, S.M. 1999. {it:Statistics on the Table: The History of Statistical Concepts and Methods.} Cambridge, MA: Harvard University Press. {p 4 8 2} Stillwell, J. 2010. {it:Mathematics and Its History.} New York: Springer. {p 4 8 2} Strickland, L. and H.R. Lewis. 2022. {it:Leibniz on Binary: The Invention of Computer Arithmetic.} Cambridge, MA: MIT Press. {p 4 8 2} Unwin, A. 2015. {it:Graphical Data Analysis with R.} Boca Raton, FL: CRC Press. {p 4 8 2} Unwin, A. 2024. {it:Getting (more out of) Graphics.} Boca Raton, FL: CRC Press. {p 4 8 2} Venn, J. 1866, 1876, 1888. {it:The Logic of Chance.} London: Macmillan. {p 4 8 2} Venn, J. 1880a. On the forms of logical proposition. {it:Mind} 5(19): 336{c -}349. {p 4 8 2} Venn, J. 1880b. On the diagrammatic and mechanical representation of propositions and reasonings. {it:The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science} Series 5, 10 (59): 1{c -}18. doi:10.1080/14786448008626877. {p 4 8 2} Venn, J. 1880c. On the employment of geometrical diagrams for the sensible representation of logical propositions. {it:Transactions of the Cambridge Philosophical Society} 4: 47{c -}59. {p 4 8 2} Venn, J. 1881, 1894. {it:Symbolic Logic.} London: Macmillan. {p 4 8 2} Venn, J. 1891. On the nature and uses of averages. {it:Journal of the Royal Statistical Society} 52: 429{c -}456. {p 4 8 2} Verburgt, L.M. 2022. {it:John Venn: A Life in Logic.} Chicago: University of Chicago Press. {p 4 8 2} Whittle, P. 2000. {it:Probability via Expectation.} New York: Springer. {p 4 8 2} Wilke, C.O. 2019. {it:Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures.} Sebastopol, CA: O'Reilly. {p 4 8 2} Wilkinson, L. 2012. Exact and approximate area-proportional circular Venn and Euler diagrams. {it:IEEE Transactions on Visualization and Computer Graphics} 18: 321{c -}331. doi:10.1109/TVCG.2011.56 {p 4 8 2} Wong, B. 2011. Color blindness. 
{it:Nature Methods} 8: 441. {browse "https://www.nature.com/articles/nmeth.1618.pdf":https://www.nature.com/articles/nmeth.1618.pdf} {p 4 8 2} Youschkevitch, A.P. 1971. Euler, Leonhard. In Gillispie, C.C. (ed.) {it:Dictionary of Scientific Biography.} New York: Charles Scribner's Sons 4: 467{c -}484. {p 4 8 2} Zeitz, P. 2007. {it:The Art and Craft of Problem Solving.} Hoboken, NJ: John Wiley. p.147 cardinality {title:Bibliographic note on Martin Gardner's columns} {p 4 4 2} Martin Gardner's columns on "Mathematical Games" over many years in {it:Scientific American} covered much more than games and puzzles and included many splendid expositions of topics with mathematical content. They present a variety of small bibliographical challenges. The original articles will be accessible to many readers at jstor.org, but typically under the title "Mathematical Games". A further tiny detail is that pagination starts afresh in each issue of {it:Scientific American}, so volume and issue number together are needed for an exact citation. The columns were collected later in book form, often revised and/or retitled, in books that themselves often varied in publisher and even title over various reprints and reissues. A project to publish further revised editions, under yet other titles, from Cambridge University Press and the Mathematical Association of America, released its first four volumes between 2008 and 2014, but appears to have stalled. At the time of writing it had not reached the books mentioned here. {p 4 4 2} {browse "https://en.wikipedia.org/wiki/List_of_Martin_Gardner_Mathematical_Games_columns":https://en.wikipedia.org/wiki/List_of_Martin_Gardner_Mathematical_Games_columns} and {browse "https://ansible.uk/misc/mgardner.html":https://ansible.uk/misc/mgardner.html} will help you find what you are looking for or indeed determine whether a relevant column was ever written.