help mata blncdtree-------------------------------------------------------------------------------

Title

blncdtree() -- Balanced search tree functions used by somersd

Syntax

real matrixblncdtree(nnode)

void_blncdtree(tree,imin,imax)

tidot=tidottree(x,y[,weight[,xcen[,ycen] ] ])

tidot=tidot(x,y[,weight[,xcen[,ycen] ] ])where

nnode:real scalartree:real matriximin:real scalarimax:real scalartidot:real colvectorx:real colvectory:real colvectorweight:real colvectorxcen:real colvectorycen:real colvector

DescriptionThese functions are used by the

somersdpackage to create balanced binary search trees and to calculatetidotvectors as defined below underRemarks. Applications of these functions are discussed in the filesomersd.pdf, which is distributed with thesomersdpackage.

blncdtree(nnode)returns a balanced binary search tree matrix, representing a balanced binary search tree whose nodes are the positive integer indices from 1 tonnode. The result is defined as annnodex 2 matrix, whoseith row contains, in column 1, the left daughter index of the indexi(or zero ifihas no left daughter) and contains, in column 2, the right daughter index of the indexi(or zero ifihas no right daughter). A balanced binary search tree has the feature that, for any indexi, the numbers of indices in the left and right subtrees ofidiffer by no more than one. The root node of the returned balanced binary search tree matrix is the index given by the Mata expresstiontrunc((1+nnode)/2).

_blncdtree(tree,imin,imax)inputs an existing matrix intreeand reassigns columns 1 and 2 of rowsimintoimaxoftree, so that these rows define a balanced binary search tree whose nodes are the row indicesisuch thatimin<=i<=imax. The root node of this search tree is the index given by the Mata expressiontrunc((imin+imax)/2).blncdtree()works by calling_blncdtree(), which works by assigning daughter indices to the root node and then calling itself recursively to produce left and right subtrees for the root node._blncdtree()is a fast program that performs no conformability or consistency checks, although it will abort if the matrixtreehas fewer than two columns.

tidottree(x,y,weight,xcen,ycen)inputs a vector of X values inxand a parallel vector of Y values iny, and, optionally, other parallel vectors inweight,xcen, andycen, containing, respectively, the weights, the censorship indicators forx, and the censorship indicators fory. A censorship indicator value is interpreted as implying left-censorship if negative, right-censorship if positive, and noncensorship if zero.tidottree()assumes thatweightis a vector of ones and thatxcenandycenare vectors of zeros, if these arguments are not provided by the user.tidottree()returns, as output, a new vectortidot, parallel to the input vectors, which will contain the weighted sums of concordance-discordance differences corresponding to the input vectors, assuming that the data matrix defined by these input vectors is sorted in ascending order ofx.tidottree()performs no checks that the input data matrix is indeed sorted byx. However,tidottree()uses a search tree algorithm, which calculates thetidotvalues in a time proportional toN*log(N). It creates the search tree by callingblncdtree().

tidot(x,y,weight,xcen,ycen)inputs the same vectors astidottree()and returns the same output vectortidot, which in this case contains concordance-discordance totals whether or not the input data matrix is sorted. However,tidot()does not use the search tree algorithm and therefore requires a time proportional to the square ofN.tidot()is therefore less efficient thantidottree()for large datasets, even after accounting for the time taken to sort the dataset before callingtidottree().

RemarksThe vector

tidotcalculated bytidottreeandtidotis defined as follows. For scalarsu,p,v, andq, define the functioncsign(

u,p,v,q)as 1 if

u>vandp>=0>=q, -1 ifu<vandp<=0<=q, and 0 otherwise. For row indicesiandj, define

T_ij= csign(x_i,xcen_i,x_j,xcen_j) * csign(y_i,ycen_i,y_j,ycen_j)as the concordance-discordance difference between rows

iandjof the input data vectors. Then the vectortidot, returned bytidottreeortidot, is defined by

Ntidot_i= Sumweight_j*T_ijj=1for each index

i. The functiontidot()calculatestidotby calculating all theT_ijand summing them, taking a time asymptotically proportional to the square ofN. The functiontidottree()uses a search tree algorithm to calculatetidotin a time asymptotically proportional toN*log(N). The algorithm used bytidottree()is a more advanced version of the one presented in Newson (2006) and assumes that the input data vectors are sorted in ascending order of the values ofx.tidottree()callsblncdtree()to create a search tree with one row for each value ofyand uses this search tree to calculatetidotwithout calculating the individualT_ij.Applications of these functions are discussed in the manual

somersd.pdf, which is distributed with thesomersdpackage as an ancillary file. Balanced binary search trees are discussed in Wirth (1976).

Conformability

blncdtree(nnode):nnode: 1x1

_blncdtree(tree,imin,imax):tree:M x KwhereK>= 2imin: 1x1imax: 1x1

tidottree(x,y,weight,xcen,ycen):x:N x1y:N x1weight:N x1xcen:N x1ycen:N x1

tidot(x,y,weight,xcen,ycen):x:N x1y:N x1weight:N x1xcen:N x1ycen:N x1

Diagnostics

blncdtree(nnode)aborts with error ifnnode< 0.

_blncdtree(tree,imin,imax)aborts with error ifcols(tree)< 2.

blncdtree(0)can return a 0x2 result.

Source codeblncdtree.mata, _blncdtree.mata, tidottree.mata, tidot.mata

AuthorRoger Newson, Imperial College London, UK. Email: r.newson@imperial.ac.uk

ReferencesNewson, R. 2006. Efficient calculation of jackknife confidence intervals for rank statistics.

Journal of Statistical Software15: 1-10.Wirth, N. 1976.

Algorithms + Data Structures = Programs. Englewood Cliffs, NJ: Prentice-Hall.

Also seeManual:

[M-0] introOnline:

mata,somersd(if installed)