{smcl}
{* 18jul2003/21jul2004/27sept2005/9nov2005/12mar2006/27mar2006/19mar2007/1july2008}{...}
{hline}
help for {hi:parplot}
{hline}

{title:Parallel coordinates plots}

{p 8 17 2}
{cmd:parplot}
{it:varlist}
[{cmd:if} {it:exp}]
[{cmd:in} {it:range}]
[
{cmd:,}
{cmd:by(}{it:byvar} [{cmd:, } {it: suboptions}]{cmd:)}
{cmdab:hor:izontal}
{* 9nov2005 identify(varname)}{...} 
{cmdab:o:ver(}{it:varname}{cmd:)} 
{cmdab:tr:ansform(}{it:transform}{cmd:)}
{cmd:variablelabels}
{cmd:plot(}{it:plot}{cmd:)}
{cmd:addplot(}{it:plot}{cmd:)}
{it:graph_options}
]


{title:Description}

{p 4 4 2}{cmd:parplot} produces a parallel coordinates plot of {it:varlist}.
Each variable is plotted on a separate vertical or horizontal scale and 
the values for each observation are shown by connected line segments.  
An observation will be ignored if it has missing values for any variable 
in {it:varlist} (and by default, if {cmd:by()} is specified, for {it:byvar}).

{p 4 4 2}Such plots have a long history under various guises. Wegman (1990)
gives a definitive account for a statistical readership.  Cooke and van
Noortwijk (2000) discuss their use, under the name cobweb plots, in sensitivity
analysis.  Robbins (2005) gives examples in an introductory text. Andrienko and
Andrienko (2005) give further examples and extensions. 


{title:Options} 

{p 4 8 2}{cmd:by()} specifies that a separate plot should be drawn 
for each value of {it:byvar}. See help on {help by_option}; among 
the available {it:suboptions}, note especially {cmd:total} and
{cmd:missing}. 

{p 4 8 2}{cmd:horizontal} draws variable scales horizontally. The 
default is vertical. 

{p 4 8 2}{cmd:over()} specifies a variable whose distinct values identify 
different categories. Each category is drawn in a different style, 
and different marker symbols, line patterns, and so forth may be 
specified for different categories. 

{p 4 8 2}{cmd:transform()} specifies a transformation to be applied to 
each variable before plotting. Each transformation name may be abbreviated 
to as little as its first letter: {cmd:m}, {cmd:c}, {cmd:s} or {cmd:r}. 

{p 8 8 2}{cmd:maxmin} specifies transforming to {bind:(value - minimum)} /
{bind:(maximum - minimum)}, which is the default. Values shown thus vary from 0
to 1. 
 
{p 8 8 2}{cmd:centered} or {cmd:centred} specifies transforming to 
{bind:(value - median)} / {bind:max(maximum - median, median - minimum)}. 
Each median is thus shown at 0 and values shown lie between -1 and 1. 
For any given variable, transformed values attain at least one of those 
limits, and attain both if and only if maximum - median = median - minimum. 
This transform was used by Gleason (1996). 
 
{p 8 8 2}{cmd:standardized} or {cmd:standardised} specifies transforming to
{bind:(value - mean) / SD}. Each mean thus is shown at 0. 

{p 8 8 2}{cmd:raw} specifies no transform, i.e. data are shown as supplied. 
This may be a good choice for variables expressed in the same units.
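
{p 8 8 2}As a rough sketch only, the three scalings defined above could be
reproduced by hand for a single variable as follows. The new variable names
{cmd:mpg_mm}, {cmd:mpg_c} and {cmd:mpg_s} are hypothetical. It is an
assumption that {cmd:parplot} calculates over the same observations: 
{cmd:parplot} ignores observations with missing values on any variable in
{it:varlist}, which these commands do not do. 

{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. quietly summarize mpg, detail}{p_end}
{p 8 12 2}{cmd:. generate mpg_mm = (mpg - r(min)) / (r(max) - r(min))}{p_end}
{p 8 12 2}{cmd:. generate mpg_c = (mpg - r(p50)) / max(r(max) - r(p50), r(p50) - r(min))}{p_end}
{p 8 12 2}{cmd:. generate mpg_s = (mpg - r(mean)) / r(sd)}{p_end}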

{p 4 8 2}{cmd:variablelabels} specifies that the variables in {it:varlist} 
be labelled on the graph by their variable labels. The default is to use 
variable names. 

{p 4 8 2}{cmd:plot(}{it:plot}{cmd:)} provides a way to add other plots to the 
generated graph; see help {help plot_option:plot option}. (Stata 8 only.) 

{p 4 8 2}{cmd:addplot(}{it:plot}{cmd:)} provides a way to add other plots to 
the generated graph; see help {help addplot_option:addplot option}. 
(Stata 9 and later.) 

{p 4 8 2}{it:graph_options} are options of 
{help twoway_connected:twoway connected}. 

       
{title:Examples} 

{p 4 8 2}{cmd:. sysuse census, clear}

{p 4 8 2}{cmd:. foreach v in death divorce marriage {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 8}gen r_`v' = log10(`v' / pop)}{p_end}
{p 4 8 2}{cmd:. {c )-}}

{p 4 8 2}{cmd:. foreach t in maxmin centred standardised raw {c -(}}{p_end}
{p 4 16 2}{cmd:. {space 8}parplot r_* , tr(`t') by(region, caption(logarithmic scales) title(US states 1980) t1(`t' scaling)) hor yla(1 "deaths" 2 "divorces" 3 "marriages", ang(h))}{p_end}
{p 4 8 2}{cmd:. {space 8}more}{p_end}
{p 4 8 2}{cmd:. {c )-}} 	

{p 4 8 2}{cmd:. sysuse auto, clear}{p_end}
{p 4 8 2}{cmd:. gen gpm = 1 / mpg}{p_end}
{p 4 8 2}{cmd:. parplot gpm weight disp, xsc(r(0.8 3.2)) yla(, ang(h)) over(foreign) ms(oh dh) clp(_ ".#")}{p_end}
{p 4 8 2}{cmd:. parplot gpm weight disp, hor ysc(r(0.8 3.2)) yla(, ang(h)) over(foreign) ms(oh dh) clp(_ ".#")}
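
{p 4 4 2}The following further examples are sketches that combine options
documented above: variable labels in place of names, a separate panel for
each category plus a panel for all observations, and a reference line at
the median under {cmd:centred} scaling. They assume that {cmd:yline()} is
passed through as one of the {it:graph_options}, and that values are shown
on the y axis when {cmd:horizontal} is not specified. 

{p 4 8 2}{cmd:. sysuse auto, clear}{p_end}
{p 4 8 2}{cmd:. parplot mpg weight displacement, variablelabels}{p_end}
{p 4 8 2}{cmd:. parplot mpg weight displacement, by(foreign, total)}{p_end}
{p 4 8 2}{cmd:. parplot mpg weight displacement, tr(centred) over(foreign) yline(0)}{p_end}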


{title:Acknowledgments}

{p 4 4 2}The {cmd:parcoord} program written by John R. Gleason for
Stata 4 (Gleason 1996) was a most valuable start for this program.
	
{p 4 4 2}Vince Wiggins made very helpful comments. Ian S. Evans supplied
the Andrienko reference. Scott Merryman found a bug. Garry Anderson 
prompted an update of this help file. 


{title:Author}

{p 4 4 2}Nicholas J. Cox, Durham University, U.K.{break} 
n.j.cox@durham.ac.uk


{title:References}

{p 4 8 2}Andrienko, G. and N. Andrienko. 2005. Blending aggregation and 
selection: adapting parallel coordinates for the visualization of large
datasets. {it:Cartographic Journal} 42: 49{c -}60. 

{p 4 8 2}Cooke, R.M. and J.M. van Noortwijk. 2000. Graphical methods.
In Saltelli, A., K. Chan and E.M. Scott (eds) 
{it:Sensitivity analysis.} Chichester: John Wiley, 245{c -}264.

{p 4 8 2}Gleason, J.R. 1996. Graphing high-dimensional data using parallel 
coordinates. {it:Stata Technical Bulletin} 29: 10{c -}14 
({it:STB Reprints} 5: 53{c -}60). 

{p 4 8 2}Robbins, N.B. 2005. {it:Creating More Effective Graphs.} 
Hoboken, NJ: Wiley. 

{p 4 8 2}Wegman, E.J. 1990. Hyperdimensional data analysis using parallel 
coordinates. {it:Journal, American Statistical Association} 85: 664{c -}675.