-------------------------------------------------------------------------------
help for parplot
-------------------------------------------------------------------------------

Parallel coordinates plots

parplot varlist [if exp] [in range] [, by(byvar[, suboptions]) horizontal over(varname) transform(transform) variablelabels plot(plot) addplot(plot) graph_options]

Description

parplot produces a parallel coordinates plot of varlist. Each variable is plotted on a separate vertical or horizontal scale and the values for each observation are shown by connected line segments. An observation will be ignored if it has missing values for any variable in varlist (and by default, if by() is specified, for byvar).

Such plots have a long history under various guises. Wegman (1990) gave a definitive account for a statistical readership. Cooke and van Noortwijk (2000) discuss their use, under the name cobweb plots, in sensitivity analysis. Robbins (2005) gives examples in an introductory text. Andrienko and Andrienko (2005) give further examples and extensions.
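The rule on missing values stated above can be checked by hand before plotting. The following is a minimal sketch, assuming the auto dataset shipped with Stata and a hypothetical helper variable nmiss. It mimics the documented rule and is not parplot's own code.

```stata
* Sketch only: mimics the documented rule, not parplot's internal code.
sysuse auto, clear

* rep78 has missing values in this dataset; price and weight do not
egen nmiss = rowmiss(price rep78 weight)

* Observations with nmiss == 0 are the ones a plot of these three
* variables would show
count if nmiss == 0
```

Any observation not counted here would be left out of a plot of all three variables, not just off the scale of the variable that is missing.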

Options

by() specifies that a separate plot should be drawn for each value of byvar. See help on by_option and note, among other possibilities, the suboptions total and missing.

horizontal draws variable scales horizontally. The default is vertical.

over() specifies a variable to be used to identify different categories. Different pens will be used for different categories, and it is possible to specify different marker symbols, line patterns, and so forth.

transform() specifies a transformation to be applied to each variable before plotting. Each transformation may be specified by as little as one letter: m, c, s or r.

maxmin specifies transforming to (value - minimum) / (maximum - minimum), which is the default. Values shown thus vary from 0 to 1.

centered or centred specifies transforming to (value - median) / max(maximum - median, median - minimum). Each median thus is shown at 0 and values shown vary from (possibly) -1 to (possibly) 1. Note that transformed values for any given variable will attain both -1 and 1 if and only if maximum - median = median - minimum. This transform was used by Gleason (1996).

standardized or standardised specifies transforming to (value - mean) / SD. Each mean thus is shown at 0.

raw specifies no transform, i.e. data are shown as supplied. This may be a good choice for variables expressed in the same units.
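To see what the three non-raw transforms do to a single variable, the formulas above can be applied by hand. This is a minimal sketch, assuming the auto dataset shipped with Stata; the variable names w_maxmin, w_centred and w_std are hypothetical, and the code reproduces the documented formulas rather than parplot's internals.

```stata
* Sketch only: reproduces the documented formulas for one variable;
* it is not parplot's internal code. Names w_* are hypothetical.
sysuse auto, clear
quietly summarize weight, detail
local min  = r(min)
local max  = r(max)
local med  = r(p50)
local mean = r(mean)
local sd   = r(sd)

* maxmin: (value - minimum) / (maximum - minimum)
gen double w_maxmin = (weight - `min') / (`max' - `min')

* centered: (value - median) / max(maximum - median, median - minimum)
gen double w_centred = (weight - `med') / max(`max' - `med', `med' - `min')

* standardized: (value - mean) / SD
gen double w_std = (weight - `mean') / `sd'

summarize w_maxmin w_centred w_std
```

The summary should show w_maxmin spanning 0 to 1 and w_centred lying within -1 and 1. The centred variable reaches both limits only when the median is midway between the extremes.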

variablelabels specifies that multiple variables be labelled by their variable labels. The default is to use variable names.

plot(plot) provides a way to add other plots to the generated graph; see help on plot_option. (Stata 8 only.)

addplot(plot) provides a way to add other plots to the generated graph; see help on addplot_option. (Stata 9 up.)

graph_options are options of twoway connected.

Examples

. sysuse census, clear

. foreach v in death divorce marriage {
.     gen r_`v' = log10(`v' / pop)
. }

. foreach t in maxmin centred standardised raw {
.     parplot r_* , tr(`t') by(region, caption(logarithmic scales) title(US states 1980) t1(`t' scaling)) hor yla(1 "deaths" 2 "divorces" 3 "marriages", ang(h))
.     more
. }

. sysuse auto, clear
. gen gpm = 1 / mpg
. parplot gpm weight disp, xsc(r(0.8 3.2)) yla(, ang(h)) over(foreign) ms(oh dh) clp(_ ".#")
. parplot gpm weight disp, hor ysc(r(0.8 3.2)) yla(, ang(h)) over(foreign) ms(oh dh) clp(_ ".#")

Acknowledgments

The parcoord program written by John R. Gleason for Stata 4 (Gleason 1996) was a most valuable start for this program. Vince Wiggins made very helpful comments. Ian S. Evans supplied the Andrienko reference. Scott Merryman found a bug. Garry Anderson provoked an update of the help.

Author

Nicholas J. Cox, Durham University, U.K.
n.j.cox@durham.ac.uk

References

Andrienko, G. and N. Andrienko. 2005. Blending aggregation and selection: adapting parallel coordinates for the visualization of large datasets. Cartographic Journal 42: 49-60.

Cooke, R.M. and J.M. van Noortwijk. 2000. Graphical methods. In Saltelli, A., K. Chan and E.M. Scott (eds) Sensitivity analysis. Chichester: John Wiley, 245-264.

Gleason, J.R. 1996. Graphing high-dimensional data using parallel coordinates. Stata Technical Bulletin 29: 10-14 (STB Reprints 5: 53-60).

Robbins, N.B. 2005. Creating More Effective Graphs. Hoboken, NJ: Wiley.

Wegman, E.J. 1990. Hyperdimensional data analysis using parallel coordinates. Journal, American Statistical Association 85: 664-675.