{smcl} {* *! version 1.5.0 26Okt2011}{...} {* *! version 1.4.0 20Nov2009}{...} {* *! version 1.2.3 23Feb2008}{...} {* *! version 1.2.2 23Dec2007}{...} {* *! version 1.2.1 18Dec2007}{...} {* *! version 1.2.0 29Nov2007}{...} {* *! version 1.1.0 10Nov2007}{...} {* *! version 1.0.0 04Nov2007}{...} {cmd:help hangroot} {hline} {title:Title} {p2colset 5 17 19 2}{...} {p2col :{hi:hangroot} {hline 2}}Hanging rootogram or suspended rootogram comparing an empirical distribution to a theoretcal distribution{p_end} {p2colreset}{...} {title:Syntax} {phang} Stand-alone {p 8 12 2} {cmd:hangroot} {it:varname} {ifin} [{help weights:fweight}] [{cmd:,} {opt dist(name)} {it:{help hangroot##options:general_opts}} {c -(}{it:{help hangroot##continuous_opts:continuous_opts}} {c |} {it:{help hangroot##discrete_opts:discrete_opts}}{c )-} ] {phang} post-estimation command {p 8 12 2} {cmd:hangroot} [{cmd:,} {it:{help hangroot##options:general_opts}} {c -(}{it:{help hangroot##continuous_opts:continuous_opts}} {c |} {it:{help hangroot##discrete_opts:discrete_opts}}{c )-} ] {pstd} The post-estimation syntax is available after estimating a model with or without covariates using one of the commands listed in {help hangroot##coef:this} table. {marker options}{...} {it:general_opts}{col 33}description {hline 67} {opt dist(name)}{...} {col 33}specifies theoretical distribution {opt par(numlist)}{...} {col 33}specifies the parameters at which the {col 33}theoretical distribution is to be fixed {col 33}These parameters will be estimated if this {col 33}option is not specified {opt susp:ended}{...} {col 33}specifies that a suspended rootogram is to {col 33}be drawn rather than a hanging rootogram {opt notheor:etical}{...} {col 33}supresses the disply of the theoretical {col 33}distribution {opt ci}{...} {col 33}draw confidence intervals {opt l:evel(#)}{...} {col 33}sets confidence level to {it:#} {opt sims(varlist)}{...} {col 33}overlays the empirical distributions of {col 33}all variables in {it:varlist} {opt sp:ike}{...} {col 33}draw the empirical distribution as spikes, {col 33}the default {opt bar}{...} {col 33}draw the empirical distribution as bars {opt ninter(#)}{...} {col 33}governs the smoothness of the theoretical {col 33}distribution if {cmd:hangroot} is used {col 33}after an estimation command with covariates {opt maino:pt(graph_options)}{...} {col 33}options governing the look of the {col 33}empirical distribution {opt theoro:pt(graph_options)}{...} {col 33}options governing the look of the {col 33}theoretical distribution {opt cio:pt(graph_options)}{...} {col 33}options governing the look of the {col 33}confidence interval {opt simsopt(graph_options)}{...} {col 33}options governing the look of the {col 33}empirical distributions specified in {opt sims()} {opt jitter:sims(integer)}{...} {col 33}horizontally jitter the marker positions for the {col 33}empircal distribution specified in {opt sims()} {col 33}using random noise {opt jitterseed()}{...} {col 33}random number seed for {opt jittersims()} {opt plot(plot)}{...} {col 33}add other plots to the graph {help hangroot##other:other options} {hline 67} {marker continuous_opts}{...} {it:continuous_opts}{col 33}description {hline 67} {opt bin(#)}{...} {col 33}set number of bins to {it:#} {opt w:idth(#)}{...} {col 33}set width of bins to {it:#} {opt start(#)}{...} {col 33}set lower limit of first bin to {it:#} {hline 67} {marker discrete_opts}{...} {it:discrete_opts}{col 33}description {hline 67} {opt d:iscrete}{...} {col 33}specify that the data are discrete {opt w:idth(#)}{...} {col 33}set width of bins to {it:#} {opt start(#)}{...} {col 33}set theoretical minimum value to {it:#} {hline 67} {title:Description} {pstd} {cmd:hangroot} draws a hanging rootogram or suspended rootogram (Tukey 1965, 1972, 1977) and (Tukey and Wilk 1965) (Also see: (Wainer 1974) and (Friendly 2000)) comparing the empirical distribution of varname to a theoretical distribution, as specified in the {opt dist} option. Both are an alternative for {cmd:histogram} with the theoretical density function plotted on top. The hanging rootogram differs from a histogram in two ways: {pmore} 1) The spikes or bars "hang" from the theoretical distribution instead of "standing" on the x-axis. The deviations are now shown as deviations from a horzontal line (y=0) instead of deviations from a curve (the density function). This makes it easier to spot patterns in the deviations. {pmore} 2) Instead of showing the freqencies it shows the square root of the frequencies. This way the sampling variation of the length of the spikes or bars is stabelized. These lengths are counts of the number of observations that fall within each bin, and larger counts tend to have larger sampling variation than smaller counts, making it harder to compare the deviations across bins. By taking the square root, the sampling variations tends to be approximately equal across bins, facilitating the comparison across bins. Moreover, this tends to make deviations in the tails, where the counts are small, more visible. {pstd} The aim of the hanging rootogram and the suspended rootogram is to compare an empirical distribution to a theortical distribution. The key part of the graph that displays this information are the deviations of the bars or spikes in the hanging rootogram from the line y=0, as these are the residuals of the empirical distribution fromt the theoretical distribution. So, a more direct way of achieving the goal of these graphs is to directly display these residuals rather than the raw number of observations belonging to each bin. This is what the suspended rootogram does. It now makes sense to flip the entire graph upside down, suspending the theoretical distribution from the line y=0, because this way positive residuals represent too many observations in a bin, and negative residuals too few observations in a bin. We can optionally suppress the display of the theoretical distribution, focussing entirly on the residuals. {title:Coefficients of the theoretical distribution} {pstd} The parameters of the theoretical can either be estimated or specified by the user using the {cmd:par()} option. For many distributions there are two ways in which {cmd:hangroot} can obtain estimates, either it computes those estimates itself, this happens when the stand-alone syntax is used, or it can use the estimates obtained previously, this happens when the post-estimation syntax is used. The post-estimation syntax is available if in the table below there is an entry in the {it:estimation command} column, while the stand-alone syntax is available if the entry in the {opt dist(name)} column is not marked with an *. {pstd} In order to use the post-estimation syntax one must first estimate the model with or without covariates using one of the estimation commands in the table below. If the previous model contained covariates, then the theoretical distribution will the marginal distribution of the dependent variable implied by the model and distribution of the covariates in the data. All the estimation commands are either part of official Stata or available from {help ssc}. The {cmd:if} and {cmd:in} qualifiers and the weights will be coppied from the last estimation command, so they may not be specified in the post-estimation syntax. This also means that this syntax is only availabe if the previous model was estimated either without weights or with fweights. {marker coef}{...} {it:distribution}{col 37}{it:estimation command}{col 58}{opt dist(name)} {hline 67} normal / Gaussian{...} {col 37}{cmd:regress}{...} {col 58}{it:{ul:norm}al} or {it:{ul:gaus}sian} lognormal{...} {col 37}{cmd:lognfit}{...} {col 58}{it:{ul:logn}ormal} logistic{...} {col 58}{it:{ul:logi}stic} Weibull{...} {col 37}{cmd:weibullfit}{...} {col 58}{it:{ul:weib}ull}* Chi square{...} {col 58}{it:chi2} gamma{...} {col 37}{cmd:gammafit}{...} {col 58}{it:gamma} Gumbel{...} {col 37}{cmd:gumbelfit}{...} {col 58}{it:{ul:gumb}el} inverse gamma{...} {col 37}{cmd:invgammafit}{...} {col 58}{it:{ul:invg}amma} Wald / inverse Gaussian{...} {col 37}{cmd:invgaussfit}{...} {col 58}{it:wald} beta{...} {col 37}{cmd:betafit}{...} {col 58}{it:beta} Pareto{...} {col 37}{cmd:paretofit}{...} {col 58}{it:{ul:pare}to} Fisk / log-logistic{...} {col 37}{cmd:fiskfit}{...} {col 58}{it:fisk}* Dagum{...} {col 37}{cmd:dagumfit}{...} {col 58}{it:dagum}* Singh-Maddala{...} {col 37}{cmd:smfit}{...} {col 58}{it:sm}* Generalized Beta II{...} {col 37}{cmd:gb2fit}{...} {col 58}{it:gb2}* generalized extreme value{...} {col 37}{cmd:gevfit}{...} {col 58}{it:gev}* exponential{...} {col 58}{it:{ul:expo}nential} Laplace{...} {col 58}{it:{ul:lapl}ace} uniform{...} {col 58}{it:{ul:unif}orm} geometric{...} {col 58}{it:{ul:geom}etric} Poisson{...} {col 37}{cmd:poisson}{...} {col 58}{it:{ul:pois}son} zero inflated Poisson{...} {col 37}{cmd:zip}{...} {col 58}{it:zip}* negative binomial I{...} {col 37}{cmd:nbreg}{...} {col 58}{it:nb1}* negative binomial II{...} {col 37}{cmd:nbreg} and {cmd:gnbreg}{...} {col 58}{it:nb2}* zero inflated negative binomial{...} {col 37}{cmd:zinb}{...} {col 58}{it:zinb}* {hline 67} * These distributions cannot be used in the stand-alone syntax. {pstd} If the post-estimation syntax is used and the {cmd:par()} option is not specified, than the best fitting distribution is retrieved from the last estimation command. If the stand-alone syntax is used and the {cmd:par()} option is not specified, than the best fitting distribution is fitted using the method specified in the table below. If the method is not maximum likelihood, the best fitting distribution in the stand-alone syntax may differ from the post-estimation syntax. {it:maximum} {col 26}{it:method of} {it:likelihood} {col 26}{it:moments} {hline 41} Gaussian {col 26}logistic log-normal {col 26}Gumbel Wald {col 26}inverse gamma Pareto {col 26}beta exponential {col 26}uniform Laplace geometric Poisson gamma(*) {hline 41} (*) approximate {pstd} Depending on the distribution specified, {cmd:hangroot} will only use observations that meet the following criteria:{p_end} {col 9}exponential {col 25}{it:varname} >= 0 {col 9}lognormal {col 25}{it:varname} >= 0 {col 9}Weibull {col 25}{it:varname} >= 0 {col 9}gamma {col 25}{it:varname} >= 0 {col 9}Gumbel {col 25}{it:varname} >= 0 {col 9}inverse gamma {col 25}{it:varname} > 0 {col 9}Pareto {col 25}{it:varname} > 0 {col 9}wald {col 25}{it:varname} > 0 {col 9}Fisk {col 25}{it:varname} > 0 {col 9}Dagum {col 25}{it:varname} > 0 {col 9}Singh-Maddala {col 25}{it:varname} > 0 {col 9}Generalized Beta II {col 25}{it:varname} > 0 {col 9}Poisson {col 25}{it:varname} >= 0 & {it:varname} = integer {col 9}zero inflated Poisson {col 25}{it:varname} >= 0 & {it:varname} = integer {col 9}negative binomial {col 25}{it:varname} >= 0 & {it:varname} = integer {col 9}zero inflated negative {col 9}binomial {col 25}{it:varname} >= 0 & {it:varname} = integer {col 9}geometric {col 25}{it:varname} >= 0 & {it:varname} = integer {col 9}beta {col 25}0 < {it:varname} < 1 {pstd} The geometric distribution is parameterized in terms of the number of failures before the first succes, instead of the number of trials needed to get the first succes. {title:General options} {phang} {opt dist(name)} specifies the theoretical distribution with which the empirical distribution is compared. This option will be ignored in the post-estimation syntax. The default is Gaussian. {pmore} The option {opt discrete} is implied when specifying the geometric or the poisson distribution. {phang} {opt par(numlist)} specifies the parameters at which the theoretical distribution is to be fixed. If this option is not specified than the estimated parematers will be used. The table below identifies which parameter is represented by which number in the {it:numlist}. The variable of interest is represented by y, the first number in numlist is represented by a, the second by b, the third by c, and the fourth by d. {hline 68} {it:distribution}{col 37}{it:parameterization} {hline 68} normal / Gaussian{...} {col 37}{help normalden}(a, b) lognormal{...} {col 37}(1 / (y * b * sqrt(2 * pi))) * {col 37}exp(-(log(y) - a)^2 / (2 * b^2)) logistic{...} {col 37}exp(-1*(y - a )/b) / {col 37}(b*(1+exp(-1*(y - a )/b))^2) Weibull{...} {col 37}(a/b)*(y/b)^(a - 1)*exp(-(y/b)^a) Chi square{...} {col 37}{help gammaden}(a/2,2,0,y) gamma{...} {col 37}{help gammaden}(a, b, 0, y) Gumbel{...} {col 37}(1 / b) * exp(-(y - a) / b) * {col 37}exp(-exp(-(y - a) / b)) inverse gamma{...} {col 37}b^a/exp({help lngamma}(a))*y^(-a-1)*exp(-b/y) Wald / inverse Gaussian{...} {col 37}sqrt(b/(2*pi*y^3)) * {col 37}exp(-b*(y-a)^2 / (2*a^2*y)) beta{...} {col 37}{help betaden}(a,b,y) Pareto{...} {col 37}b*a^b/y^(b+1) Fisk / log-logistic{...} {col 37}a*((b/y)^a)*(1/<)/(1 + (b/y)^a)^2 Dagum{...} {col 37}(a*c)*((b/y)^a)* {col 37}(1/y)/(1 + (b/y)^a)^(c+1) Singh-Maddala{...} {col 37}(a*c/b)*((1 + (y/b)^a)^-(c+1))* {col 37}((y/b)^(a-1)) Generalized Beta II{...} {col 37}a*y^(a*c-1)*((b^(a*c))* {col 37}exp({help lngamma}(c) + {help lngamma}(d) - {col 41}{help lngamma}(c + d))* {col 37}(1 + (y/b)^a )^(c+d))^-1 generalized extreme value{...} {col 37}1/b*(1+c*((y-a)/b))^(-1-1/c)* {col 37}exp(-1*(1+c*((y-a)/b))^(-1/c)) exponential{...} {col 37}a*exp(-a*y) Laplace{...} {col 37}1/(2*b)*exp(-1*|y-a|/b) uniform{...} {col 37}1/(b-a) geometric{...} {col 37}(1-a)^y*a Poisson{...} {col 37}exp(-a)*a^y/y! zip{...} {col 37}{help cond}(y==0, b + (1-b)*exp(-a), {col 37}(1-b)*( exp(-a)*a^y/y! ) negative binomial I{...} {col 37}exp({help lngamma}(y + a) - {help lngamma}(y+1) - {col 41}{help lngamma}(a)) * b^y / (1 + b)^(y+a) negative binomial II{...} {col 37}exp({help lngamma}(y + 1/b) - {help lngamma}(y + 1) - {col 41}{help lngamma}(1/b)) * (a/(1/b + a))^y * {col 37}(1/(b*(1/b + a)))^(1/b) zero inflated negative {...} {col 37}{help cond}(y==0 , c + (1-c)*(1/(1+a*b))^(1/b), binomial{...} {col 37}(1-c) * {col 37}exp({help lngamma}(1/b+y) - {help lngamma}(y+1) - {col 41}{help lngamma}(1/b)) * {col 37}(1/(1+a*b))^(1/b) * (1-1/(1+a*b))^y ) {hline 68} {phang} {opt susp:ended} specifies that a suspended rootogram is drawn rather than a hanging rootogram. {phang} {opt notheor:etical} suppresses the display of the theoretical curve. This option is only allowed in combination with the {opt suspended} option. {phang} {opt ci} specifies that confidence intervals are drawn around the bottom of the spikes or the bars. These confidence intervals assume that the number of observations in a bin follow a multinomial distribution, and use Goodman's (1965) approximation of the simultaneous confidence interval. These confidence intervals do not take into account that the parameters in the theoretical distribution are also estimated. These confidence intervals also do not take into account that nearby bins are likely to be similar, as was suggested by Vermeesch (2005). However, I would consider this latter point a feature, as this corresponds with the simple non-parametric logic that is behind the histogram and the (hanging) rootogram. {phang} {opt l:evel(#)} specifies the confidence level, in percent, for the confidence intervals; see {help level}. {phang} {opt sims(varlist)} specifies variables whose empirical distribution will be overlaid on top of the graph. The intended use is that these variables are a set of random samples from the theoretical distribution, thus providing an informal confidence interval. {phang} {opt sp:ike} specifies that the empirical distribution is graphed as spikes. This is the default. {phang} {opt bar} specifies that the empirical distribution is graphed as bars. {phang} {opt ninter(#)} specifies the number of points between bin-midpoints for which the theoretical distribution is computed. It governs the smoothness of the curve representing the theoretical distribution. This option is only allowed when {cmd:hangroot} is used after an estimation command with covariates. The default is 5, and can be any integer between and including 0 and 20. {phang} {opt maino:pt(graph_options)} specifies options govinging the look of the empirical distribution or the residuals are drawn. By default or when the {opt spike} option is specified, these options can be the options of {help twoway rspike}. When the {opt bar} option is specified, these options can be the options of {help twoway rbar}. In either case the the {opt by()}, {opt horizontal}, and {opt vertical} options are not allowed. The options that can be specified in {opt} can also be directly added as {help hangroot##other:other_options}. {phang} {opt theoro:pt(graph_options)} specifies options govinging the look of the theoretical distribution. These options can be the options of {help twoway line}. This option is not allowed when the {cmd:notheoretical} option is specified. {phang} {opt cio:pt(graph_options)} specifies options govinging the look of the confidence interval. These can be the options of: {pmore} {help twoway rbar} when the {opt suspended} option is not specified and the empirical distribution is represented by spikes, {pmore} {help twoway rcap} when the {opt suspended} option is not specified and the emprical distribution is represented by bars, {pmore} {help twoway rarea} when the {opt suspended} option is specified. {phang} {opt simsopt(graph_options)} specifies options govinging the look of the simulated distributions. These options can be the options of {help twoway pcspike}. This option is only allowed when the {cmd:sims()} option is specified. {phang} {opt jitter:sims(integer)} adds random noice to the vertical possition of the markers representing the simulations, where {it:integer} represents the size of the noise as a percentage of the distance between the highest and lowest marker position for the simulated variables. {phang} {opt jitterseed()} random number seed for {opt jittersims()} {phang} {opt plot(plot)} provides a way to add other plots to the generated graph; see: {help addplot_option} (Stata 9 and 10) or {help plot_option} (Stata 8). {phang} {marker other}{...} {it: Other options} When the option {opt bar} is specified all {help twoway rbar} options are allowed, except {opt by()}, {opt horizontal}, and {opt vertical}. Otherwise all {help twoway rspike} options are allowed, with the same exceptions. {title:Options for use in the continuous case} {phang} {opt bin(#)} and {opt width(#)} are alternatives. They specify how the data are to be aggregated into bins; {opt bin()} by specifying the number of bins (from which the width can be derived) and {opt width()} by specifying the bin width (from which the number of bins can be derived). {pmore} If neither option is specified, results are the same as if {opt bin(k)} were specified, where {phang3} {it:k} = min{c -(}sqrt({it:N}), 10*ln({it:N})/ln(10){c )-} {pmore} and where {it:N} is the number of observations. {phang} {opt start(#)} specifies the theoretical minimum of varname. The default is {opt start(m)}, where {it:m} is the observed minimum value of {it:varname}. {pmore} Specify {opt start()} when you are concerned about sparse data, for instance, if you know that {it:varname} can have a value of 0, but you are concerned that 0 may not be observed. {pmore} {opt start(#)}, if specified, must be less than or equal to {it:m}, or else an error will be issued. {title:Options for use in the discrete case} {phang} {opt discrete} specifies that varname is discrete and that you want each unique value of {it:varname} to have its own bin (bar of histogram). {phang} {opt width(#)} is rarely specified in the discrete case; it specifies the width of the bins. The default is {opt width(d)}, where {it:d} is the observed minimum difference between the unique values of {it:varname}. {pmore} Specify {opt width()} if you are concerned that your data are sparse. For example, in theory {it:varname} could take on the values, say, 1, 2, 3, ..., 9, but because of the sparseness, perhaps only the values 2, 4, 7, and 8 are observed. Here the default width calculation would produce {cmd:width(2)} and you would want to specify {cmd:width(1)}. {phang} {opt start(#)} is also rarely specified in the discrete case; it specifies the theoretical minimum value of varname. The default is {opt start(m)}, where {it:m} is the observed minimum value. {pmore} As with {opt width()}, you specify {opt start(#)} if you are concerned that your data are sparse. In the previous example, you might also want to specify {cmd:start(1)}. {opt start()} does nothing more than add white space to the left side of the graph. {pmore} The value of {it:#} in {opt start()} must be less than or equal to {it:m}, or an error will be issued. {title:Examples} {pstd} The residuals after a linear regression {help regress} should be normally distributed, but in this case it appears to follow a bimodal distribution. {cmd}{...} sysuse nlsw88, clear gen ln_w = ln(wage) reg ln_w grade age ttl_exp tenure predict resid, resid hangroot resid {txt}{...} {p 4 4 2}({stata "hangroot_ex 1":click to run}){p_end} {pstd} This bimodal distribution appears to be the result of the ommision of the variable union. {cmd}{...} sysuse nlsw88, clear gen ln_w = ln(wage) reg ln_w grade age ttl_exp tenure union predict resid2, resid hangroot resid2 {txt}{...} {p 4 4 2}({stata "hangroot_ex 2a":click to run}){p_end} {pstd} The part of the graph that tells us how wel the distribution fits to the data is the distance between the bottom of the spikes and the horizontal line y=0. So why not explicitly plot these residuals instead? When we do that, it would also make sense to flip the entire graph upside down: In that case bins with too many cases will receive positive residuals and bins with too few cases negative residuals. This is done with the {opt susp} option {cmd}{...} hangroot resid2, ci susp theoropt(lpattern(-)) {txt}{...} {p 4 4 2}({stata "hangroot_ex 2b":click to run}){p_end} {pstd} One can focuss more on the residuals by removing the theoretical distribution. {cmd}{...} hangroot resid2, ci susp notheor {txt}{...} {p 4 4 2}({stata "hangroot_ex 2c":click to run}){p_end} {pstd} {cmd:hangroot} can be used as a post-estimation command. In this case a log normal distribution without covariates was fitted using {cmd:lognfit}, which is available from {help ssc}. {cmd}{...} sysuse nlsw88, clear lognfit wage hangroot, ci {txt}{...} {p 4 4 2}({stata "hangroot_ex 3a":click to run}){p_end} {pstd} A hanging rootogram can also be used to compare the distribution of two a variable across two groups. In the example below the wage distribution of those with a college degree is the reference/"theoretical" distribution, and the wage distribution of the respondents without a college degree hangs from it. {cmd}{...} sysuse nlsw88, clear hangroot wage, dist(theoretical collgrad) {txt}{...} {p 4 4 2}({stata "hangroot_ex 4":click to run}){p_end} {pstd} {cmd:hangroot} can also be used to compare an empirical distribution with the marginal distribution implied by a regression model. In this case we create data that is appropriate for linear regression, but the distribution of y looks nothing like a normal distribution. {pstd} In Stata >= 10 I would have used {help rnormal}() instead of {cmd:invnormal(uniform())}, but I have not done so here since {cmd:hangroot} is also supposed to work in Stata 9.2. {cmd}{...} drop _all set obs 1000 gen byte x = _n <= 250 gen y = -3 + 3*x + invnormal(uniform()) hangroot y, dist(normal) {txt}{...} {p 4 4 2}({stata "hangroot_ex 5a":click to run}){p_end} {pstd} However, the distribution of y fits the marginal distribution implied by the regression model. In this case we could also have inspected the residuals, which should be normally distributed. However, looking at residuals won't work for models that imply other distribution, e.g. Poisson or beta regression, but in those cases one can still inspect the marginal distribution. {cmd}{...} reg y x hangroot {txt}{...} {p 4 4 2}({stata "hangroot_ex 5b":click to run}){p_end} {pstd} Some deviation from the theoretical distribution is expected, as the data are typically random draws from a larger population. A nice way to see what kind of variability is still consistent with the model is to create a couple variables that are random draws assuming that the model is correct, and overlay the distribution of these random draws on top of the hanging rootogram. {pstd} In Stata >= 10 I would have used {help rnormal}(mu,sd) instead of {cmd:invnormal(uniform())*sd + mu}, but I have not done so here since {cmd:hangroot} is also supposed to work in Stata 9.2. {cmd}{...} predict double mu , xb scalar sd = e(rmse) forvalues i = 1/20 { gen sim`i' = invnormal(uniform())*sd + mu } hangroot, sims(sim*) jitter(5) xlab(-6(3)3) {txt}{...} {p 4 4 2}({stata "hangroot_ex 5c":click to run}){p_end} {title:Author} {p 4 4} Maarten L. Buis{break} Universitaet Tuebingen{break} Institut fuer Soziologie{break} maarten.buis@uni-tuebingen.de {p_end} {title:Acknowledgement} {phang} Several programming tricks from {help dpplot} by Nick Cox are incorporated in this program. {title:References} {phang} Friendly, M. 2000. Visualizing categorical data. Cary, NC: SAS Institute. {phang} Goodman, L.A. 1965, On Simultaneous Confidence Intervals for Multinomial Proportions. {it:Technometrics}, 7(2), pp. 247-254. {phang} Tukey, J.W. 1965. The future of processes of data analysis. Reprinted in Jones, L.V. (ed.) 1986. The collected works of John W. Tukey. Volume IV: Philosophy and principles of data analysis: 1965-1986. Monterey, CA: Wadsworth and Brooks/Cole, 517-547. {phang} Tukey, J.W. and Wilk, M.B. 1965. Data analysis and statistics: principles and practice. Reprinted in Cleveland, W.S. (ed.) 1988. The collected works of John W. Tukey. Volume V: Graphics: 1965-1985. Pacific Grove, CA: Wadsworth and Brooks/Cole, 23-29. {phang} Tukey, J.W. 1972. Some graphic and semigraphic displays. In Bancroft, T.A. and Brown, S.A. (eds) Statistical papers in honor of George W. Snedecor. Ames, IA: Iowa State University Press, 293-316. {phang} Tukey, J.W. 1977, {it:Exploratory Data Analysis}, Addison-Wesley. {phang} Vermeesch, P. 2005, Statistical uncertainty associated with histograms in the Earth Sciences, {it:Journal of Geophysical Research - Solid Earth}, Vol 110, B02211. {phang} Wainer, H. 1974, The Suspended Rootogram and Other Visual Displays: An Empirical Validation. {it: The American Statistician}, 28(4), pp. 143-145. {title:Also see:} Estimation commands: {p 4 4} If installed: {help lognfit}, {help weibullfit}, {help gammafit}, {help gumbelfit}, {help invgammafit}, {help invgaussfit}, {help betafit}, {help paretofit}, {help fiskfit}, {help dagumfit}, {help smfit}, {help gb2fit} {help gevfit} Alternatives: {p 4 4} Online: {help spikeplot}, {help histogram}, {help qnorm}, {help pnorm}, {help pchi}, and {help qchi}. {p 4 4} if installed: {help dpplot}, {help pbeta}, {help qbeta}, {help pweibull}, {help qweibull}, {help plogn}, {help qlogn}, {help pgamma}, {help qgamma}, {help pgumbel}, {help qgumbel}, {help pinvgauss}, {help qinvgauss}, {help pinvgamma}, {help qinvgamma}, {help pdagum}, {help qdagum}, {help pgb2}, {help qgb2}, {help psm}, {help qsm}