Distributional diagnostic plots (lognormal distribution)
qlognorm varname [if exp] [in range] [, grid graph_options a(#) ml ]
plognorm varname [if exp] [in range] [, grid graph_options a(#) ml ]
Description
qlognorm plots the quantiles of varname against the quantiles of the corresponding lognormal distribution (Q-Q plot).
plognorm graphs a standardized lognormal probability (P-P) plot for varname.
The (two-parameter) lognormal distribution fitted corresponds to a normal distribution with the mean and standard deviation of log(varname).
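As a minimal sketch of what that fit amounts to, the two parameters can be computed by hand with official commands. The example assumes a Stata version that provides sysuse, the shipped auto dataset and the invnormal() function; the names logmpg, mu and sigma are illustrative only and are not created by qlognorm or plognorm.

```stata
* load the example data (assumption: auto dataset available via sysuse)
sysuse auto, clear

* the fitted lognormal is a normal with the mean and sd of log(mpg)
generate double logmpg = log(mpg)
quietly summarize logmpg
scalar mu = r(mean)
scalar sigma = r(sd)
display "mean of log(mpg) = " mu
display "sd of log(mpg)   = " sigma

* quantiles of the fitted lognormal on the raw scale: exp(mu + sigma * z)
display "fitted lower quartile = " exp(mu + sigma * invnormal(0.25))
display "fitted median         = " exp(mu)
display "fitted upper quartile = " exp(mu + sigma * invnormal(0.75))
```

Comparing these fitted quantiles with the corresponding observed quantiles of mpg is, in essence, what a Q-Q plot does across the whole sample.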
Remarks
Sometimes there is interest in whether the lognormal is appropriate as a distribution model for a variable. Other times there is interest in whether the logarithm of a variable is more nearly normal than that variable itself. These are two sides of the same question. qlognorm and plognorm are commands for investigating it directly.
With official Stata, it is easy to generate a new variable which is the logarithm of a variable and then to use qnorm and pnorm to see whether that new variable is close to normal in distribution. Using qlognorm and plognorm instead has these small but distinct advantages:
1. If you do this frequently, you will need to type less, and you need not create a logged variable first: sometimes, but not always, you will decide that a log transformation is advisable.
2. Fit can be assessed graphically on both raw and transformed scales.
3. If desired, you can use a plotting position other than the i / (N + 1) wired into qnorm and pnorm.
4. If desired, you can insist on maximum likelihood estimation.
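For comparison, the official-Stata route described above might look like the following sketch. It assumes the auto dataset is available via sysuse; the variable name logmpg is illustrative.

```stata
* load the example data (assumption: auto dataset available via sysuse)
sysuse auto, clear

* step 1: generate the logarithm as a new variable
generate double logmpg = log(mpg)

* step 2: check that new variable against the normal
qnorm logmpg
pnorm logmpg
```

Here the fit is seen only on the transformed scale, and logmpg stays in the dataset whether or not the transformation turns out to be worth keeping.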
Options
grid adds grid lines at the .05, .10, .25, .50, .75, .90, and .95 quantiles when specified with qlognorm. It is equivalent to yline(.25 .5 .75) xline(.25 .5 .75) when specified with plognorm.
graph_options are any of the options allowed with graph, twoway; see help grtwoway.
a(#) specifies a family of plotting positions, defined by (i - a) / (N - 2a + 1), where i is the rank assigned to an observed value and N is the number of observed values. The default is 0.5. (Note that qnorm and pnorm in effect use a = 0, which gives i / (N + 1).) Choice of a is rarely material unless the sample size is very small, and then the exercise is moot whatever is done. For more on plotting positions, see http://www.stata.com/support/faqs/stat/pcrank.html.
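To see how little the choice of a usually matters, the plotting positions can be computed by hand. This is a sketch under stated assumptions, not the code used inside qlognorm: it assumes the auto dataset, uses egen's rank() function (which assigns average ranks to ties, possibly unlike qlognorm), and the names rank, nobs, pp_half and pp_zero are illustrative.

```stata
* load the example data (assumption: auto dataset available via sysuse)
sysuse auto, clear

* rank i of each observed value and number of observed values N
egen double rank = rank(mpg)
quietly count if !missing(mpg)
scalar nobs = r(N)

* plotting positions (i - a) / (N - 2a + 1) for a = 0.5 and a = 0
local a 0.5
generate double pp_half = (rank - `a') / (nobs - 2 * `a' + 1)
generate double pp_zero = rank / (nobs + 1)

* compare the two choices for the smallest values
sort mpg
list mpg rank pp_half pp_zero in 1/5
```

With a = 0.5 the formula reduces to (i - 0.5) / N, and with a = 0 to i / (N + 1), so the two differ most in the tails and converge as N grows.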
ml specifies maximum likelihood estimation. This option is for purists only. The only difference it makes is to ensure that the standard deviation of log(varname) is calculated as the root mean square deviation from the mean. Multiplying the default standard deviation, which is that produced by summarize, by a factor of sqrt((N - 1) / N) is rarely material unless the sample size is very small, and then the exercise is moot whatever is done.
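The size of that difference can be checked directly. The sketch below, which assumes the auto dataset and uses the illustrative names logmpg, dev2, sd_default and sd_ml, computes the root mean square deviation by hand and compares it with the standard deviation reported by summarize.

```stata
* load the example data (assumption: auto dataset available via sysuse)
sysuse auto, clear
generate double logmpg = log(mpg)

* default: standard deviation as produced by summarize
quietly summarize logmpg
scalar sd_default = r(sd)

* maximum likelihood: root mean square deviation from the mean
generate double dev2 = (logmpg - r(mean))^2
quietly summarize dev2
scalar sd_ml = sqrt(r(mean))

display "default sd     = " sd_default
display "ML sd          = " sd_ml
display "ratio ML / def = " sd_ml / sd_default
```

The ratio depends only on N and approaches 1 as N grows, which is why the option seldom changes the look of either plot.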
Examples
. qnorm mpg
. qlognorm mpg
. qlognorm mpg, xlog ylog
. plognorm mpg
Author
Nicholas J. Cox, University of Durham, U.K. n.j.cox@durham.ac.uk
Also see
Manual: [R] diagplots, [R] summarize
On-line: help for diagplots, graph