Iteration options used by censlope
Syntax
Percentile slope estimation
censlope ... [, options]
Set default maximum iterations
set maxiter # [, permanently]
options Description ------------------------------------------------------------------------- fromabs(#) initial estimate for absolute magnitude of slopes brackets(#) maximum number of rows for the bracket matrix technique(algorithm_spec) iterative numerical solution technique iterate(#) perform maximum of # iterations; default is iterate(16000) tolerance(#) tolerance for the percentile slopes log display an iteration log of the brackets during bracket convergence nolimits do not calculate confidence limits ------------------------------------------------------------------------- where algorithm_spec is
algorithm [ # [ algorithm [#] ] ... ]
and algorithm is { bisect | regula | ridders }
Description
The censlope command calculates estimates and confidence limits for a median or other percentile slope beta by solving numerically a scalar equation in beta, using an iterative method. The options controlling the exact iterative method will probably not be used often, because censlope is intended to have sensible defaults. However, users who wish to change the default method may do so, using a set of options similar to the maximization options used by Stata maximum likelihood estimation commands.
set maxiter specifies the default maximum number of iterations for estimation commands that iterate. The initial value is 16000, and # can be 0 to 16000. To change the maximum number of iterations performed by a particular estimation command, you need not reset maxiter; you can specify the iterate(#) option. When iterate(#) is not specified, the maxiter value is used.
Iteration options
fromabs(#) specifies an initial estimate of the typical absolute magnitude of a percentile slope. If fromabs() is not specified, it defaults to the aspect ratio (ymax-ymin)/(xmax-xmin) (where xmax and xmin are the maximum and minimum X values, and ymax and ymin are the maximum and minimum Y values) if that ratio is defined and nonzero, and to 1 otherwise. This magnitude is used in constructing the bracket matrix. Candidate bracket beta-values will have values of 0 or of +/-fromabs*2^K, where K is a nonnegative integer. The bracket matrix is a matrix with two columns and three or more rows, each row containing a candidate beta-value in column 1 and the corresponding zetastar-value in column 2. It is used to find an initial pair of beta-values for input into the iterative numerical solution method, which attempts to find a solution in beta between the two initial beta-values. The bracket matrix is initialized to have beta-values -fromabs, 0 and +fromabs, and zetastar-values corresponding to these beta-values. If a target zeta-value is outside the range of the zetastar-values of the bracket matrix, then the bracket matrix is extended by adding new rows before the first row by successively doubling the beta-value in the first row, or by adding new rows after the last row by successively doubling the beta-value in the last row, until there is a zetastar-value in the second column on each side of the target zeta-value. For an explanation of this terminology, see Methods below.
brackets(#) specifies a maximum number of rows for the bracket matrix. The minimum is brackets(3). The default is brackets(1000).
technique(algorithm_spec) specifies an iterative solution method for finding a solution in beta to the equation to be solved. The following algorithms are currently implemented in censlope.
technique(bisect) specifies an adapted version of the bisection method for step functions.
technique(regula) specifies an adapted version of the regula falsi (or false position) method for step functions.
technique(ridders) specifies an adapted version of the method of Ridders (1979) for step functions.
The default is technique(ridders 5 bisect iterate), where iterate is the value of the iterate() option. The bisection method is guaranteed to converge in a number of iterations similar to the binary logarithm of the tolerance() option. The regula falsi and Ridders methods are usually faster if the zetastar function is very nearly continuous but may sometimes be slower if the zetastar function is a discrete step function. All methods are modified versions, for step functions, of the methods of the same names described in Press et al. (1992).
You can switch between algorithms by specifying more than one in the technique() option. By default, censlope will use an algorithm for five iterations before switching to the next algorithm. To specify a different number of iterations, include the number after the technique in the option. For example, specifying technique(ridders 10 bisect 1000) requests that censlope perform 10 iterations by using the Ridders algorithm, perform 1000 iterations by using the bisection algorithm, and then switch back to Ridders for 10 iterations, and so on. The process continues until convergence or until the maximum number of iterations is reached.
iterate(#) specifies the maximum number of iterations. When the number of iterations equals iterate(), the iterative solution program stops and records failure to converge. If convergence is declared before this threshold is reached, it will stop when convergence is declared. The default value of iterate(#) is the current value of set maxiter, which is iterate(16000) by default.
tolerance(#) specifies the tolerance for the percentile differences. When the relative difference between the current beta-brackets is less than or equal to tolerance(), the tolerance() convergence criterion is satisfied. tolerance(1e-6) is the default.
log specifies that an iteration log showing the progress of the numerical solution method is to be displayed. If an iteration log is displayed, then there will be four separate iteration sequences per percentile, estimating the left estimate, the right estimate, the lower confidence limit, and the upper confidence limit, respectively. For this reason, the default is not to produce an iteration log. However, if censlope is expected to be slow (as for large datasets), then an iteration log can be specified to reassure the user that progress is being made.
nolimits specifies that lower and upper confidence limits will not be calculated. This will save computational time if the user plans to calculate confidence limits using a resampling prefix command, such as bootstrap or jackknife.
Methods
The program censlope uses iterative numerical methods to solve an equation involving a monotonically nonincreasing step function. Given two variables Y and X, we attempt to solve in beta an equation
zetastar(beta) - zetatarget = 0
where we define
zetastar(beta) = zeta( theta( Y - beta*X , X ) )
where theta(U,V) may be either D(U|V) (Somers' D) or tau_a(U,V) (Kendall's tau-a) as defined in somersd, zeta() is any one of the transformations used by the transf() option of somersd, and zetatarget is a target zeta-value. This target zeta-value may be zeta(1-2q) if we are trying to estimate the 100qth percentile slope or zeta(1-2q) +/- multiplier*standard_error if we are trying to calculate confidence limits for the 100qth percentile slope. In either case, the left-hand side of the equation to be solved is a step function, and therefore a unique solution may not exist, as there may be no exact solution or an interval of exact solutions. We therefore have to find either a left solution (defined as the supremum value of beta such that zetastar(beta)-zetatarget is positive), or a right solution (defined as the infimum value of beta such that zetastar(beta)-zetatarget is negative). For a given 100qth percentile slope, the estimate of the percentile is defined as the mean of the left and right solutions, the lower confidence limit is defined as a left solution, and the upper confidence limit is defined as a right solution. Therefore, each percentile slope to be estimated requires four iteration sequences, solving the left estimate, the right estimate, the lower confidence limit, and the upper confidence limit, respectively. This requirement implies many evaluations of the object function zetastar(beta)-zetatarget, each of which in turn involves calculating Somers' D or Kendall's tau-a. It is therefore important to be able to calculate Somers' D and/or Kendall's tau-a efficiently, if censlope is to be used for large samples. Fortunately, somersd uses the algorithm of Newson (2006), which calculates Somers' D and/or Kendall's tau-a in a time of order N*log(N), where N is the number of observations.
Initial beta-value estimates for the iterative methods are stored in the bracket matrix, which is a matrix with two columns. The first column contains a sequence of beta-values in ascending order, and the second column contains the corresponding zetastar-values. As the function zetastar() is nonincreasing, it follows that a pair of zetastar-values from the bracket matrix, one on each side of the target zetastar-value, will correspond to a pair of beta-values, one on each side of the left or right solution to the equation in beta. The bracket matrix is initialized, and extended if necessary, as described in the help for the fromabs() and brackets() options.
The methods used by censlope and somersd are implemented using a library of Mata functions, whose code is distributed with the somersd package in accordance with the open-source principle.
Author
Roger Newson, Imperial College London, UK. Email: r.newson@imperial.ac.uk
References
Newson, R. 2006. Efficient calculation of jackknife confidence intervals for rank statistics. Journal of Statistical Software 15: 1-10.
Press, W. H., S. A., Teukolsky, W. T. Vetterling, and B. P. Flannery. 1992. Numerical Recipes in C: The Art of Scientific Computing. 2nd ed. Cambridge, UK: Cambridge University Press.
Ridders, C. J. F. 1979. A new algorithm for computing a single root of a real continuous function. IEEE Transactions on Circuits and Systems, vol. CAS-26(11): 979-980.
Also see
Manual: [R] maximize
Online: lrtest, ml, test, somersd, censlope, cendif (if installed)