{smcl}
{* 2 November 2012}{...}
{hline}
help for {hi:missingplot}
{hline}

{title:A plot to show patterns of missing values in a dataset}

{p 8 17 2}
{cmd:missingplot}
[{it:varlist}]
[{cmd:if} {it:exp}]
[{cmd:in} {it:range}]
[{cmd:,}
{cmd:all}
{cmd:labels}
{cmdab:var:iablenames}
{it:scatter_options}]


{title:Description}

{p 4 4 2}{cmd:missingplot} gives a plot showing the incidence of missing
values in one or more variables in the current dataset. The horizontal axis
shows observation numbers; the vertical axis shows one or more lines, one for
each variable shown. Marker symbols show which values are missing.

{p 4 4 2}{cmd:missingplot} treats numeric and string variables alike: what is
common to both is whether the {cmd:missing()} function returns true. In the
case of numeric variables no distinction is made between system missing (.)
and any extended missing value .a ... .z. See {help missing} for a tutorial if
desired. Users wishing to select classes of variables, for example all numeric
or all string variables, may wish to use first either {help ds} or
{help findname} (if installed).

{p 4 4 2}{cmd:missingplot} may be useful for seeing broad patterns in the
incidence of missing values, for example blocks of observations with many or
all missing values or variables with many or all missing values. It may also
be useful for quickly identifying fine structure or notable detail in some
instances. See also {help misstable} (Stata 11 up) and {help nmissing} (if
installed).


{title:Remarks}

{p 4 4 2}For a loosely similar plot, see Wilkinson (2005, p.487). Users of
this program knowing of references to interesting earlier or similar work are
encouraged to send references to the program author.

{p 4 4 2}The mechanics of the plot are that each variable in the plot is
represented by a single variable inside the program. There is currently a
limit of 20 variables being shown in any one graph.

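
{p 4 4 2}As a minimal sketch of the variable selection mentioned in the
Description, {help ds} leaves the names it finds in {cmd:r(varlist)}, which
can then be passed to {cmd:missingplot}. The dataset and the choice of numeric
variables are illustrative assumptions only:

{p 4 8 2}{cmd:. webuse nlsw88, clear}{p_end}
{p 4 8 2}{cmd:. ds, has(type numeric)}{p_end}
{p 4 8 2}{cmd:. missingplot `r(varlist)'}{p_end}

{p 4 4 2}Similarly, where more variables qualify than one graph can show, one
possible approach is to plot blocks of variables in turn. The split point
below is arbitrary and assumes the variable order of {cmd:nlsw88} as shipped:

{p 4 8 2}{cmd:. missingplot idcode-collgrad}{p_end}
{p 4 8 2}{cmd:. missingplot south-tenure}{p_end}
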

{title:Options}

{p 4 8 2}{cmd:all} specifies that all variables implied by {it:varlist} should
be plotted, regardless of whether they contain missing values. The default of
{cmd:missingplot} is to omit variables from the plot if they have no missing
values (in the observations selected, if either {cmd:if} or {cmd:in} has been
specified). Specifying {cmd:all} is more likely to trigger the limit of 20
variables shown.

{p 4 8 2}{cmd:labels} specifies that marker labels be shown identifying the
observation number of each missing value. In practice this will work best with
a small number of missing values or a small dataset or both. Note that marker
labels are generated separately for each variable plotted (see Remarks), so
that marker label options apply to one variable at a time; thus if you wish to
change away from the default for every variable you would need to specify
(e.g.) {cmd:mlabcolor(blue ..)}.

{p 4 8 2}{cmd:variablenames} specifies that variable names only be shown to
identify variables. The default is to show variable labels if they exist, and
variable names otherwise. The value of this option is usually to increase the
space devoted to the graph itself.

{p 4 8 2}{it:scatter_options} are options of {help scatter}.


{title:Examples}

{p 4 8 2}{cmd:. webuse nlsw88, clear}{p_end}
{p 4 8 2}{cmd:. missingplot}{p_end}
{p 4 8 2}{cmd:. missingplot, var labels}{p_end}
{p 4 8 2}{cmd:. missingplot, var labels mlabcolor(blue ..)}{p_end}


{title:Author}

{p 4 4 2}Nicholas J. Cox, Durham University, U.K.{break}
n.j.cox@durham.ac.uk


{title:References}

{p 4 8 2}
Cox, N.J. 1999.
Numbers of missing and present values.
{it:Stata Technical Bulletin} 49: 7{c -}8.
(Software updates
{it:Stata Technical Bulletin} 60: 2{c -}3;
{it:Stata Journal} 3:449 and 5:607)

{p 4 8 2}
Cox, N.J. 2010.
Finding variables.
{it:Stata Journal} 10: 281{c -}296.
(Software updates
{it:Stata Journal} 10:691 and 12:167)

{p 4 8 2}
Wilkinson, L. 2005.
{it:The Grammar of Graphics.}
New York: Springer.


{title:Also see}

{p 4 8 2}On-line: help for {help misstable}, {help nmissing} (if installed)