help sixplot (22 January 2010)
-------------------------------------------------------------------------------

Title

Syntax

sixplot varlist [if] [in]

Description

sixplot displays six diagnostic and descriptive graphs for a single variable, formatted as a 2 row, 3 column array. The arguments are varname and sequence variable. If no sequence variable is named, the program plots varname versus the sequence the data are stored in.

The plot in the (1,1) position is a sequence plot of varname versus the sequence.

The plot in the (1,2) position is a residual versus fitted plot of the regression of varname versus sequence.

The plot in the (1,3) position is a boxplot of varname.

The plot in the (2,1) position is a first difference plot of varname versus sequence.

The plot in the (2,2) position is a histogram of varname.

The plot in the (2,3) position is a normal quantile plot of varname.

The default is to conduct these analyses for all observations in the data set in the order they are recorded. If you sort the data, the analysis will be conducted in the sorted order.
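The layout described above can be approximated by hand with built-in Stata graph commands. The following is only a sketch under stated assumptions, not sixplot's own code: the variable names seq and d_le and the graph names p11 to p23 are hypothetical, and the sequence is taken to be the storage order.

```stata
* A hand-built approximation of the six panels (not the sixplot source)
sysuse uslifeexp, clear

* Hypothetical helper variables: storage order and first difference
generate seq  = _n
generate d_le = le_male - le_male[_n-1]

* (1,1) sequence plot with linear fit and 95% forecast interval
twoway (lfitci le_male seq, stdf) (scatter le_male seq), name(p11, replace)

* (1,2) residual versus fitted plot from the regression on the sequence
regress le_male seq
rvfplot, name(p12, replace)

* (1,3) boxplot, (2,1) first difference plot
graph box le_male, name(p13, replace)
scatter d_le seq, name(p21, replace)

* (2,2) histogram with 10 bins, (2,3) normal quantile plot
histogram le_male, bin(10) name(p22, replace)
qnorm le_male, name(p23, replace)

* Arrange the six panels as 2 rows of 3
graph combine p11 p12 p13 p21 p22 p23, rows(2)
```

Building the panels separately like this is mainly useful when you want to change one of them, for example the number of histogram bins, while keeping the overall array.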

The sequence plot allows you to examine the data for drift over the sequence (presumably time). This graph also displays the linear fit line and a 95% forecast interval. Observations outside the shaded interval are candidates for inspection as outliers. If you plot more than 300 observations, the plot is blurred and I suggest you use batches of 300.

The rvfplot displays the residuals versus the fitted values and allows you to check for outliers and patterns such as unequal variance over fitted values. Clear patterns suggest you should look closely at your model. It also displays limits of 2*rmse as a guide.
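A residual versus fitted plot with guide limits can be reproduced with the built-in rvfplot command. This sketch assumes the limits are drawn symmetrically at plus and minus 2*rmse around zero, which the text does not state explicitly; seq and lim are hypothetical names.

```stata
sysuse uslifeexp, clear
generate seq = _n

* Regress the variable on the sequence, then read the root MSE from e()
regress le_male seq
local lim = 2*e(rmse)

* Residuals versus fitted values, with assumed guide lines at +/- 2*rmse
rvfplot, yline(`lim' -`lim')
```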

The boxplot shows quartiles and outliers.

The first difference plot checks for changes in the data.

The histogram provides a picture of the distribution of varname. It has 10 bins, which you may wish to change in further analysis. It should be roughly symmetric if the data are normal. Do not get overly concerned with apparent departures from symmetry if your data set is small.

The normal quantile plot gives a graphical diagnostic of normality. If the plot suggests non-normality, there may be concern about the validity of procedures such as confidence intervals.
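For further analysis outside sixplot, the bin count can be changed with the built-in histogram command. The choice of 20 bins below is arbitrary and only for illustration; the normal option overlays a normal density, and qnorm redraws the normal quantile plot on its own.

```stata
sysuse nlsw88, clear

* Histogram with an illustrative 20 bins and a normal density overlay
histogram wage, bin(20) normal

* Stand-alone normal quantile plot of the same variable
qnorm wage
```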

Caution: If the data set is large, the sequence plot and the first difference plot may be blurred and difficult to interpret. We suggest examining the data in batches of 300 or so, using a restriction such as in 1/300. sixplot does not superimpose a normal curve on the histogram.
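The batching suggestion can be automated with a loop. This is a sketch that assumes sixplot is installed and that it leaves the combined graph as the current graph; the export file names are hypothetical.

```stata
sysuse nlsw88, clear
local n = _N

* Step through the data in batches of 300 observations
forvalues start = 1(300)`n' {
    local stop = min(`start' + 299, `n')
    sixplot wage in `start'/`stop'
    * Save each batch (hypothetical file name) before the next one replaces it
    graph export "sixplot_wage_`start'_`stop'.png", replace
}
```

The last batch is shortened automatically by the min() call when the number of observations is not a multiple of 300.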

Examples
-----------------------------------------------------------------------

Setup

. sysuse uslifeexp.dta
. sixplot le_male
. sixplot le_female

The data set gives life expectancy by sex and race from 1900 to 1999. The above commands provide a sixplot for these years.

Setup

. sysuse nlsw88.dta
. sixplot wage

This data set has over 2000 observations, which blurs the information on the plots. There is no obvious time relation here.

. sixplot wage in 1/300

The "in" restriction can be repeated as 301/600, etc.

Notes

These plots were cited in Good and Hardin's book as the fourplot. I have added the rvfplot and boxplot. This seems to have originated in the Engineering Statistics Handbook, section 4.4.5.3, from NIST (available online from www.itl.nist.gov/div898/handbook/).

Author

Peter A. Lachenbruch, Oregon State University, Corvallis, peter.lachenbruch@oregonstate.edu

Acknowledgement

I thank Nick Cox and Vince Wiggins, who made several useful and important suggestions to improve the ado file.

References

Good, P. I. and Hardin, J. W. (2009). Common Errors in Statistics (and How to Avoid Them). New York: Wiley.

NIST. Engineering Statistics Handbook. Downloaded Jan 4, 2010; see section 4.4.5.3.