{smcl} {* 11feb2003}{...} {hline} help for {cmd:ellip7} {right:(SSC: 20030211)} {hline} {title:Graphing confidence ellipses} {title:[An update of {cmd:ellip} for Stata 7]} {p 8 15}{cmd:ellip7} {it:yvar} [{it:xvar}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [{cmd:,} {c -(}{cmd:means}|{cmd:coefs} [{cmdab:p:ool}{cmd:(}{it:#}{cmd:)}]{c )-} {cmdab:c:onstant}{cmd:(}{it:string} [{it:#}]{cmd:)} {cmdab:l:evel}{cmd:(}{it:#}{cmd:)} {cmdab:g:enerate}{cmd:(}{it:ynewvar} {it:xnewvar}{cmd:)} {cmdab:a:dd}{cmd:(}{it:yoldvar} {it:xoldvar}{cmd:)} {cmdab:nogr:aph} {cmd:replace} {cmd:evr}{cmd:(}{it:#}{cmd:)} {cmdab:np:oints}{cmd:(}{it:#}{cmd:)} {cmdab:yf:ormat}[{cmd:(%}{it:fmt}{cmd:)}] {cmdab:xf:ormat}[{cmd:(%}{it:fmt}{cmd:)}] {it:graph_options} ] {title:Description} {p}{cmd:ellip7} graphs confidence ellipses for approximately normally distributed data, and is an update of {cmd:ellip} to Stata 7. A confidence ellipse is the boundary of an elliptical joint 100(1-alpha)% confidence region for two parameters. In {cmd:ellip7}, the centering variables {it:yvar} and {it:xvar} are two data variables or the first two independent variables after an immediately preceding {cmd:regress}. If {cmd:coefs} is specified without {it:xvar}, then the _cons in {cmd:regress} is used for {it:xvar}. The boundary constant determines the size of the confidence ellipse. {title:Options} {p 0 4}{c -(}{cmd:means}|{cmd:coefs}{c )-} specifies how to center the confidence ellipse. The default, and the {cmd:means} option uses two variable means, whereas {cmd:coefs} uses the first two regression coefficients from an immediately preceding {cmd:regress}. If you restricted {cmd:regress} to a portion of the data using {cmd:if} or {cmd:in}, then you will generally want to use the same conditions with {cmd:coefs}. {p 0 4}{cmdab:p:ool}{cmd:(}{it:#}{cmd:)} displays a confidence ellipse labeled {cmd:bp} using all the data, a confidence ellipse labeled {cmd:b} using a theoretically unproblematic subset, and {it:#} lines connecting #+1 dots of fractionally pooled regression coefficients dots at 1/# intervals. {cmd:pool}{cmd:(}{cmd:)} must be used with {cmd:if} or {cmd:in}, and with {cmd:coefs}, {cmd:generate}{cmd:(}{cmd:)}, and {cmd:add}{cmd:(}{cmd:)}. {cmd:pool}{cmd:(}{cmd:)} is incompatible with {cmd:by}{cmd:(}{cmd:)}. {p 0 4}{cmdab:c:onstant}{cmd:(}{it:string} [{it:#}]{cmd:)} specifies the boundary constant as a statname and an optional #. The overall default, and the {cmd:means} default is the standard deviation ellipse with {cmd:constant}{cmd:(}{it:sd} {it:2}{cmd:)} or, squared, {cmd:constant}{cmd:(}{it:sq} {it:4}{cmd:)}. The standard deviation ellipse is a.k.a. the covariance, concentration, data, error, or inertia ellipse. With the statname {it:sq}, the confidence level in percent is {bind:(1 - exp^(-#/2)) * 100).} It is the ellipse which is the most representative of the data points without any a priori statistical assumptions concerning their origin. The default corresponds to 95% INDIVIDUAL confidence intervals, or 86% JOINT confidence intervals. {it:sd} and {it:sq} cannot be used with {cmd:level}{cmd:(}{cmd:)}. I have NOT implemented the standard deviation curve for geographical data, see Gong (2002). The {cmd:coefs} default is {cmd:constant}{cmd:(}{it:f} {it:2}{cmd:)}. The default {it:#} is 4 for {it:sq}, otherwise it is 2. Available statistics are: {p 8 22}{it:statname} {space 2} definition{p_end} {hline 51} {p 8 22}{cmd:sd} {space 8} standard deviation = #^2{p_end} {p 8 22}{space 11} cannot be used with {cmd:level}{cmd:(}{cmd:)}{p_end} {p 8 22}{cmd:sq} {space 8} squared standard deviation = sd^2 = #{p_end} {p 8 22}{space 11} cannot be used with {cmd:level}{cmd:(}{cmd:)}{p_end} {p 8 22}{cmd:tsq} {space 7} Hotelling one-sample T-squared{p_end} {p 8 22}{cmd:hotel} {space 5} same as {cmd:tsq} = #(n-1)/(n-#) * F{p_end} {p 8 22}{cmd:tsqn} {space 6} sample-adjusted tsq = tsq / n{p_end} {p 8 22}{cmd:hoteln} {space 4} same as {cmd:tsqn}{p_end} {p 8 22}{cmd:ptsqn} {space 5} Hotelling T-squared prediction or{p_end} {p 8 22}{space 11} tolerance ellipse = tsqn * (n+1) / n {p_end} {p 8 22}{cmd:photeln} {space 3} same as {cmd:ptsqn}{p_end} {p 8 22}{cmd:chisq} {space 5} Chi-squared{p_end} {p 8 22}{cmd:chisqn} {space 4} sample-adjusted Chi-squared = chisq / n{p_end} {p 8 22}{cmd:f} {space 9} F = 2F * (#,n-#){p_end} {p 8 22}{cmd:fadj} {space 6} F-adjusted = = 2F * (2,n-#){p_end} {p 8 22}{space 11} Defaults {cmd:f}{it:2} and {cmd:fadj}{it:2} are equivalent{p_end} {p 0 4}{cmdab:l:evel}{cmd:(}{it:#}{cmd:)} specifies the confidence level, in percent, for calculation of the confidence ellipse; the default {it:#} is 95. {cmd:level}{cmd:(}{cmd:)} cannot be used with {cmd:constant}{cmd:(}{it:sd}{cmd:)} and {cmd:constant}{cmd:(}{it:sq}{cmd:)}. {p 0 4}{cmdab:g:enerate}{cmd:(}{it:ynewvar} {it:xnewvar}{cmd:)} generates two new variables, {it:ynewvar} and {it:xnewvar}, which define the confidence ellipse. If the current dataset contains fewer observations than in {cmd:npoints}{cmd:(}{cmd:)}, then the length of the dataset will be expanded accordingly with missing values, even if ynewvar and xnewvar are temporary variables, and a warning message is displayed. {cmd:generate}{cmd:(}{cmd:)} cannot be used with {cmd:by}{cmd:(}{cmd:)}. {p 0 4}{cmdab:a:dd}{cmd:(}{it:yoldvar} {it:xoldvar}{cmd:)} adds an old confidence ellipse to the new confidence ellipse. The result is two overlaid confidence ellipses in the same graph. May be used with but does not require {cmd:generate}{cmd:(}{cmd:)}. {p 0 4}{cmdab:nogr:aph} suppresses the display of the graph. {p 0 4}{cmd:replace} replaces any existing variables in {cmd:generate}{cmd:(}{cmd:)}. {p 0 4}{cmd:evr}{cmd:(}{it:#}{cmd:)} specifies the error variance ratio, where # is a floating point number between 0 and 10^36. The default is 1. evr(0) corresponds to regression of x on y, evr(1) to orthogonal regression, and a larger number, say evr(999), corresponds to regression of y on x. See McCartin (n.d.). {p 0 4}{cmdab:np:oints}{cmd:(}{it:#}{cmd:)} specifies # points to be calculated for the confidence ellipse. The default is 400. You seldom have to use this option, but users with Small Stata may want to lower the number and if the output looks jagged try increasing the number. {p 0 4}{cmdab:yf:ormat}[{cmd:(%}{it:fmt}{cmd:)}] specifies the display format of the y-axis. The default is to use a {cmd:%9.0g} format. {p 4 4}{cmd:yformat} specifies that the y-axis uses the {it:yvar}'s display format. {p 4 4}{cmd:yformat(%}{it:fmt}{cmd:)} specifies the format to be used for the y-axis (see help {help format}). {p 0 4}{cmdab:xf:ormat}{cmd:(%}{it:fmt}{cmd:)} specifies the display format of the x-axis; see the {cmdab:yf:ormat}[{cmd:(%}{it:fmt}{cmd:)}] option above. {p 0 4}{it:graph_options} are any options allowed with {cmd:graph, twoway}, including {cmd:by}{cmd:(}{it:varname}{cmd:)}. {cmd:by}{cmd:(}{cmd:)} is incompatible with {cmd:pool}{cmd:(}{cmd:)}. {cmd:by}{cmd:(}{cmd:)} with many groups may exceed the "width" of the dataset because of the {cmd:stack} included in {cmd:ellip7}. Defaults are: c(l) s(.) t1(" ") t2(" ") l1({it:yvar}) b2({it:xvar}) [or c(ll) s(..) if {cmd:add}{cmd:(}{cmd:)} is specified, etc.], and l1(Estimated {it:yvar}) and b2(Estimated {it:xvar}) if {cmd:coefs} is specified. {title:Remarks} {p 4 4}The latest version for Stata 7 is version 1.3.1 of {cmd:ellip7}. The last version for Stata 6 was version 1.2.0 of {cmd:ellip6}. To use the {cmdab:p:ool}{cmd:(}{it:#}{cmd:)} option, you must have gphdt.ado and gphsave.ado installed. {p 4 4}{cmd:ellip7} is a graphics command, but {cmdab:g:enerate}{cmd:(}{cmd:)} may lengthen the dataset. Only one statistics may be requested in {cmd:constant}{cmd:(}{cmd:)}; A simple but limited workaround is to use {cmd:add}{cmd:(}{cmd:)}. {p 4 4}The {cmd:by}{cmd:(}{it:varname}{cmd:)} graph option bug has been fixed in {cmd:ellip7} but not in {cmd:ellip6}. In {cmd:ellip7}, the graph option now displays an ellipse for each value of {it:varname} in {cmd:by}{cmd:(}{cmd:)}, as expected. {cmd:ellip7} also introduce the nograph option, and the sq argument to constant(). {p 4 4}Stata 8 became available in January 2003. Stata 8 has a new graphics programming language, and many new graphics features. For example, Stata 8 has a new built-in method for overlaying graphs with a ||-separator and a ()-binding notation. To create overlaid confidence ellipses with Stata 7 or Stata 6, I recommend Nick Cox's muxyplot.ado. That is, generate the ellipse variables with the {cmd:gen}{cmd:(}{cmd:)} option, and then use {cmd:muxyplot} {it:yvarlist} {it:xvarlist}. The complementary command muxyplot with helpfile can be downloaded separately from SSC. {p 4 4}Version 1.3.0 from 20030116 had bugs which have been fixed in 1.3.1. That is, by() with coefs would not report results and, more importantly, would incorrectly report the default (means) by-results; this bug does not apply to ellip6 and ellip5, because they are not byable. pool() would only use two independent variables as part of its calculations even if the immediately preceding regress command used more independent variables; the bug still affects ellip6 (the original version of ellip/ellip5 never had the bug, because it was used after fit rather than after regress). {p 4 4}The author is currently developing the program in Stata 8. Please contact the author if you want to contribute in any way. {title:Examples} {p 1 26}{inp:. ellip y x} {space 12} (graph sd ellipse){p_end} {p 1 26}{inp:. ellip y x, g(sdy sdx)} {space 1}(graph and generate sd ellipse){p_end} {p 1 26}{inp:. ellip y x, c(hoteln) a(sdy sdx)}{break} (overlaid graph of 95% Hotelling confidence ellipse and previous sd ellipse) {p_end} {p 1 26}{inp:. reg dv iv}{p_end} {p 1 26}{inp:. ellip iv, coefs c(chisq)}{break} (graphs a 95% Chi-square confidence ellipse around the regression coeffient for iv and around _cons in the preceding regression){p_end} {title:Author} {p 5}Anders Alexandersson {p_end} {p 5}ITS, Mississippi State University{p_end} {p 5}Mississippi State, MS 39762{p_end} {p 5}USA{p_end} {title:References} {p 5 10}Batschelet, E. 1981. Circular Statistics in Biology. London and New York: Academic Press. {p 5 10}Gong, J. 2002. Clarifying the Standard Deviational Ellipse. Geographical Analysis 34(2): 155-167. {p 5 10}Johnson, R., and D. Wichern. 2002. 5th ed. Applied Multivariate Statistical Analysis. Upper Saddle River, NJ: Prentice Hall. {p 5 10}McCartin, B. n.d. A Geometric Characterization of Linear Regression. Statistics: A Journal of Theoretical and Applied Statistics. {title:Also see} Manual: {hi:[R] graph}, {hi:[R] gph} STB: {hi:STB-46 gr32}, {hi:STB-34 gr20} {p 0 19}On-line: help for {help gphsave}, {help gphdt}, {help muxyplot} (if installed){p_end}