{smcl} {.-} help for {cmd:normalbvr} {right:(Roger Newson)} {.-} {title:Generate Normal bivariate ridits} {p 8 27} {cmd:normalbvr} {it:newvarname} [{cmd:if} {it:exp}] [{cmd:in} {it:range}] , {cmdab:x}{cmd:(}{it:expression_1}{cmd:)} {cmdab:y}{cmd:(}{it:expression_2}{cmd:)} [ {cmdab:r:ho}{cmd:(}{it:expression_3}{cmd:)} {cmdab:mux}{cmd:(}{it:expression_4}{cmd:)} {cmdab:muy}{cmd:(}{it:expression_5}{cmd:)} {cmdab:sdx}{cmd:(}{it:expression_6}{cmd:)} {cmdab:sdy}{cmd:(}{it:expression_7}{cmd:)} {cmd:float} ] {pstd} where {it:expressioni} (for {it:i} from 1 to 7) is a numeric expression. The numeric expression for each option must be in the form required by the {cmd:generate} command. That is to say, each expression must be specified so that the command {pstd} {cmd:gene double }{it:newvarname}{cmd:=(}{it:expressioni}{cmd:)} {pstd} will work. {title:Description} {pstd} {cmd:normalbvr} inputs expressions specifying an {it:X}-variable and a {it:Y}-variable and parameters for a bivariate normal distribution, and generates a new variable, containing bivariate ridits of the {it:X}-variable and the {it:Y}-variable with respect to the specified bivariate Normal distribution. Normal bivariate ridits are used in power calcuations for Kendall's tau-a. If the {it:X}-variable and the {it:Y}-variable have the specified bivariate Normal distribution, then the mean bivariate ridit is equal to the population Kendall's tau-a, and the sampling variance of the sample Kendal's tau-a is equal to 4 times the variance of the bivariate ridits divided by the sample number. {title:Options} {p 0 4}{cmd:x(}{it:expression_1}{cmd:)} must be specified. It gives an expression for the {it:X}-variable. {p 0 4}{cmd:y(}{it:expression_2}{cmd:)} must be specified. It gives an expression for the {it:Y}-variable. {p 0 4}{cmd:rho(}{it:expression_3}{cmd:)} gives an expression for the correlation coefficient of the specified bivariate Normal distribution. If not specified, it is set to zero. {p 0 4}{cmd:mux(}{it:expression_4}{cmd:)} gives an expression for the mean of the {it:X}-variable in the specified bivariate Normal distribution. If not specified, it is set to zero. {p 0 4}{cmd:muy(}{it:expression_5}{cmd:)} gives an expression for the mean of the {it:Y}-variable in the specified bivariate Normal distribution. If not specified, it is set to zero. {p 0 4}{cmd:sdx(}{it:expression_6}{cmd:)} gives an expression for the standard deviation of the {it:X}-variable in the specified bivariate Normal distribution. If not specified, it is set to 1. {p 0 4}{cmd:sdy(}{it:expression_7}{cmd:)} gives an expression for the standard deviation of the {it:Y}-variable in the specified bivariate Normal distribution. If not specified, it is set to 1. {p 0 4}{cmd:float} specifies that the output variable will have a {help datatypes:storage type} no higher than {hi:float}. If {cmd:float} is not specified, then {cmd:normalbvr} creates the output variable with storage type {hi:double}. Whether or not {cmd:float} is specified, {cmd:normlbvr} compresses the output variable as much as possible without loss of precision. (See help for {help compress}.) {title:Methods and Formulas} {pstd} The bivariate ridit of an {it:x}-value and a {it:y}-value, with respect to a bivariate distribution for a bivariate random variable {it:(X,Y)}, is equal to {pstd} {it:B_XY(x,y) = E[sign(x-X)*sign(y-Y)]} {pstd} or (equivalently) to the difference between the probability that a random value of {it:(X,Y)} is concordant with {it:(x,y)} and the probability that a random value of {it:(X,Y)} is discordant with {it:(x,y)}. The expectation and variance of the variable {it:B_XY(X,Y)} are used in power calculations for Kendall's tau-a. The population Kendall's tau-a is equal to the population mean of {it:B_XY(X,Y)}, and the standard deviation of the influence function of Kendall's tau-a is equal to twice the population standard deviation of {it:B_XY(X,Y)}. The standard deviation of the influence function of a sample statistic can be divided by the square root of the sample number to obtain the asymptotic standard error of the sample statistic. {pstd} Note that bivariate ridits are defined on a scale fron -1 to 1, by analogy with the univariate ridits defined by Brockett and Levene (1977). For more about univariate ridits, see the help for the {help ssc:SSC} package {helpb wridit}. {pstd} For more about estimating Kendall's tau-a in Stata, using the {help sc:SSC} package {helpb somersd}, see Newson (2006). For more about generalized power calculations using standard deviations of influence functions, and using the {help sc:SSC} package {helpb powercal}, see Newson (2004). For more about the distribution theory of {it:U}-statistics (such as Kendall's tau-a), see Section 3.2 of Puri and Sen (1971). For the application of this theory to power calculations for Kendall's tau-a, see Newson (2018). {title:Examples} {pstd} The first example generates 10000 observations, each with a pair of values sampled from a standard bivariate Normal distribution with a Pearson correlation cofficient of 0.5. We then use {cmd:normalbvr} to compute the Normal bivariate ridits in a new variable {cmd:bvridit}. Finally, we use {helpb collapse} to collapse the dataset to 1 observation, containing the mean of {cmd:bvridit} in the variable {cmd:taua}, which is an estimate for Kendall's tau-a, and the standard deviation of the influence function in a variable {cmd:sdinf}, and list this summary datset. {p 8 16}{inp:. clear}{p_end} {p 8 16}{inp:. set seed 98765432}{p_end} {p 8 16}{inp:. set obs 10000}{p_end} {p 8 16}{inp:. scal rhoscal = 0.5}{p_end} {p 8 16}{inp:. gene xvar = rnormal()}{p_end} {p 8 16}{inp:. gene yvar = rhoscal*xvar + rnormal()*sqrt(1-rhoscal*rhoscal)}{p_end} {p 8 16}{inp:. normalbvr bvridit, x(xvar) y(yvar) rho(rhoscal)}{p_end} {p 8 16}{inp:. collapse (count) N=bvridit (mean) taua=bvridit (sd) sdinf=bvridit}{p_end} {p 8 16}{inp:. replace sdinf=2*sdinf}{p_end} {p 8 16}{inp:. list, abbr(32)}{p_end} {pstd} The second example first creates a dataset with 1 observation for each of a sequence of 13 Pearson correlations ranging from -1 to 1 by increments of 1/12, stored in a variable {cmd:rhovar}. We then use the {help ssc:SSC} package {helpb expgen} to create an expanded dataset, in which each observation in the original dataset is replaced by 10000 observations, and simulate, in each observation, a pair of values in variables {cmd:xvar} and {cmd:yvar}, sampled from a bivariate standard Normal distribution with the Pearson correlation stored in {cmd:rhovar}. After this, we use {cmd:normalbvr} to compute the Normal bivariate ridits, and {helpb collapse} the dataset to create a dataset with 1 observation per Pearson correlation, and variables {cmd:taua} estimating the corresponding Kendall's tau-a and {cmd:sdinf} estimatind the standard deviation of the influence function for estimatin Kendall's tau-a. This dataset, created using simulation, is listed, and illustrates Greiner's relation between Kendall's tau-a and the Pearson correlation, which is given by {pstd} {it: taua = (2/_pi)*asin(rho)} {pstd} and which holds for a bivariate Normal distribution. {p 8 16}{inp:. clear}{p_end} {p 8 16}{inp:. set seed 98765432}{p_end} {p 8 16}{inp:. set obs 25}{p_end} {p 8 16}{inp:. gene rhovar=(_n-13)/12}{p_end} {p 8 16}{inp:. sort rhovar}{p_end} {p 8 16}{inp:. expgen =10000, sortedby(unique) copyseq(xyseq)}{p_end} {p 8 16}{inp:. gene xvar = rnormal()}{p_end} {p 8 16}{inp:. gene yvar = rhovar*xvar + rnormal()*sqrt(1-rhovar*rhovar)}{p_end} {p 8 16}{inp:. normalbvr bvridit, x(xvar) y(yvar) rho(rhovar)}{p_end} {p 8 16}{inp:. collapse (count) N=bvridit (mean) taua=bvridit (sd) sdinf=bvridit, by(rhovar)}{p_end} {p 8 16}{inp:. replace sdinf=2*sdinf}{p_end} {p 8 16}{inp:. list, abbr(32)}{p_end} {pstd} Alternatively, we might use a grid of Normal percentiles, instead of sampling by Monte Carlo simulation. In either case, we are numerically integrating values for Kendall's tau-a and the standard deviation of its influence function. These can then be used in power calculations by the {help ssc:SSC} package {helpb powercal}. {title:Author} {pstd} Roger Newson, Imperial College London, UK. Email: {browse "mailto:r.newson@imperial.ac.uk":r.newson@imperial.ac.uk} {title:References} {phang} Brockett, P. L., and Levene, A. 1977. On a characterization of ridits. {it:The Annals of Statistics} 5(6): 1245-1248. {pstd} Newson R. 2018. Bivariate ridits and distribution theory for Kendall's tau-a. Download from {browse "http://www.rogernewsonresources.org.uk/papers.htm#miscellaneous_documents":Roger Newson's website}. {pstd} Newson R. 2006. Confidence intervals for rank statistics: Somers' {it:D} and extensions. {it:The Stata Journal} 6(3): 309-334. Download from {browse "http://www.stata-journal.com/article.html?article=snp15_6":The Stata Journal website}. {pstd} Newson R. 2004. Generalized power calculations for generalized linear models and more. {it:The Stata Journal} 4(4): 379-401. Download from {browse "http://www.stata-journal.com/article.html?article=st0074":The Stata Journal website}. {pstd} Puri M. L., and Sen P. K. 1971. Nonparametric Methods in Multivariate Statistics. New York: John Wiley & Sons Inc. {title:Also see} {p 4 13 2} {bind: }Manual: {hi:[R] collapse} {p_end} {p 4 13 2} On-line: help for {helpb collapse}{break} help for {helpb powercal}, {helpb somersd}, {helpb expgen}, {helpb wridit} (if installed)