help mata mm_jk()


mm_jk() -- Jackknife estimation


jk = mm_jk(f, X [, w, nodots, strata, cluster, subpop, fpcvar, stat, ...])


           f:  pointer scalar containing the address of the function to be jackknifed, i.e. f = &functionname()
           X:  real matrix containing data (rows are observations, columns are variables)
           w:  real colvector containing weights
      nodots:  real scalar indicating that replication dots be suppressed
      strata:  real colvector containing (ordered) strata ID variable
     cluster:  real colvector containing (ordered) cluster ID variable
      subpop:  real colvector containing subpopulation identifier
      fpcvar:  real colvector containing sampling fractions for finite population correction
        stat:  real rowvector containing the results of f using the original data, i.e. the "observed" value of f
         ...:  up to 10 optional arguments to pass through to f

real matrix mm_jk_report(jk [, what, level, mse, fpc])


        what:  string vector containing statistics to be reported, where the available statistics are:
                   "b" or "theta"  ("observed" value)
                   "pseudo"        (jackknife pseudovalues)
                   "mean"          (jackknife mean)
                   "bias"          (jackknife mean - observed value)
                   "v"             (variance-covariance matrix)
                   "se"            (standard error; the default)
                   "ci"            (confidence interval)
       level:  real scalar containing the confidence level for confidence intervals (default is 95 or as set by set level)
         mse:  real scalar indicating that the mean squared error formula be used
         fpc:  real vector containing sampling fractions for finite population correction

jk is a variable used for communication between mm_jk() and mm_jk_report(). If you declare jk, declare it to be transmorphic.


mm_jk(f, X, w) computes the jackknife (leave-one-out) replicates of function f applied to the data X (and weights w; omit w or specify w as 1 to obtain unweighted results) and returns the results as a structure. To be precise, f is a pointer to a function, i.e. f = &functionname(), e.g. f = &mean() (see [M-2] ftof). mm_jk() expects function f to return a real rowvector of parameter estimates to be jackknifed. Furthermore, function f must take the data as its first argument and the weights as its second argument. Note that mm_jk() leaves out observations by setting their weights to zero. Make sure that function f handles these observations properly: the results from f must be the same as if the zero-weight observations had been deleted.
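As an illustration of these requirements, here is a minimal sketch of a user-written function that could be passed to mm_jk(). The function name myratio() and the simulated data are hypothetical and not part of moremata; the sketch assumes moremata is installed. It relies on mean() accepting weights, so that rows with weight zero do not contribute to the result.

```mata
mata:
// Hypothetical statistic: ratio of the weighted means of columns 1 and 2.
// Data come first, weights second, and a real rowvector is returned.
// Rows with weight 0 drop out of mean(X, w), which is the behavior
// mm_jk() relies on when it leaves out an observation.
real rowvector myratio(real matrix X, real colvector w)
{
    real rowvector m

    m = mean(X, w)
    return(m[1] / m[2])
}

X  = runiform(50, 2)                 // simulated data, for illustration only
jk = mm_jk(&myratio(), X, 1, 1)      // w = 1 (unweighted), nodots = 1
mm_jk_report(jk, ("b", "se"))
end
```

A quick way to check a candidate function is to compare its result with a zero weight on some row against its result with that row removed; the two should agree.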

nodots!=0 indicates that replication dots be suppressed. By default, a single dot character is displayed for each successful replication and a single red 'x' is displayed for each unsuccessful replication. A replication is considered unsuccessful if the replication result contains one or more missing values. mm_jk() only returns results from successful replications.

strata and cluster may be used to specify a strata ID variable and a cluster ID variable. mm_jk() will then produce jackknife results for stratified and clustered data. Note that mm_jk() does not sort the data: a new stratum begins each time strata changes from one row to the next, and a new cluster begins each time cluster (or strata) changes from one row to the next. Omit strata or specify strata=. if the sample is unstratified; omit cluster or specify cluster=. if the sample does not contain clusters.
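The following sketch shows how the ID vectors might be laid out for a small design with two strata and three clusters per stratum. The data and ID values are made up for illustration; the only point is that rows belonging to the same stratum, and to the same cluster within a stratum, are contiguous.

```mata
mata:
X       = runiform(12, 2)                 // simulated data, for illustration only
strata  = (1\1\1\1\1\1\2\2\2\2\2\2)       // rows already ordered by stratum
cluster = (1\1\2\2\3\3\4\4\5\5\6\6)       // and by cluster within stratum
jk = mm_jk(&mean(), X, 1, 1, strata, cluster)   // w = 1, nodots = 1
mm_jk_report(jk, "se")
end
```

If the data are not already ordered this way, sort X and the ID vectors jointly before calling mm_jk().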

subpop specifies that estimates be computed for the subpopulation for which subpop!=0. Providing only the data for which subpop!=0 and omitting subpop produces different results than providing all data and specifying subpop. See [SVY] subpopulation estimation for information. Omit subpop or specify subpop=. if the estimates are to be based on all observations.

fpcvar provides sampling fractions to be stored with the jackknife replicates. The stored sampling fractions are then used by mm_jk_report() for finite population correction of variance estimates. Note that fpcvar should contain the same number of rows as X and is assumed to be constant within each stratum. You may specify fpcvar=. to set the sampling fractions to zero (and, therefore, omit finite population correction).

By default, mm_jk() first applies f to the original data to obtain the "observed" value of f given X and w. Alternatively, the "observed" value may be provided as stat, where stat is a real rowvector of point estimates. Omit stat or specify stat=. if you do not want to provide the "observed" value.
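A brief sketch of supplying the "observed" value directly, assuming moremata is installed. The unused positional arguments are filled with missing values as described above; the data are simulated for illustration.

```mata
mata:
X  = runiform(30, 2)                 // simulated data, for illustration only
b  = mean(X)                         // "observed" value, computed beforehand
// argument order: f, X, w, nodots, strata, cluster, subpop, fpcvar, stat
jk = mm_jk(&mean(), X, 1, 1, ., ., ., ., b)
mm_jk_report(jk, "b")
end
```

This can be useful when the point estimates are already available, since mm_jk() then need not evaluate f on the original data.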

mm_jk_report() is used to analyze the jackknife replicates computed by mm_jk(). It returns a matrix of statistics such as jackknife means, jackknife standard errors, or jackknife confidence intervals (see the what argument above). Multiple statistics are arranged beneath one another in the specified order. For example, mm_jk_report(jk, ("b","se","ci")) will return the observed values in the first row, the standard errors in the second row, and the lower and upper bounds of the confidence intervals in the third and fourth rows.

level specifies the confidence level, as a percentage, for confidence intervals. The default is level=95 or as set by set level.

mse!=0 indicates that variances and standard errors be computed using deviations of the replicates from the "observed" value. By default, variances and standard errors are computed using deviations of the pseudovalues from their mean.
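To make the difference between the two formulas concrete, the following sketch computes both by hand for the sample mean, using only built-in Mata functions. It follows the formulas in [R] jackknife as understood here for the unweighted, unstratified case; it is not taken from the mm_jk_report() source, and the toy data are made up for illustration.

```mata
mata:
x     = (2 \ 4 \ 4 \ 7 \ 9)          // toy data, for illustration only
n     = rows(x)
theta = mean(x)                      // "observed" value
rep   = J(n, 1, .)                   // leave-one-out replicates
for (j = 1; j <= n; j++) {
    w    = J(n, 1, 1)
    w[j] = 0                         // leave out observation j via a zero weight
    rep[j] = mean(x, w)
}
pseudo = n*theta :- (n - 1)*rep      // jackknife pseudovalues
// default: deviations of the pseudovalues from their mean
se     = sqrt(variance(pseudo) / n)
// mse: deviations of the replicates from the "observed" value
se_mse = sqrt((n - 1)/n * sum((rep :- theta):^2))
se, se_mse, sqrt(variance(x) / n)
end
```

For the sample mean the two formulas coincide with the usual standard error of the mean, so all three displayed values agree. For nonlinear statistics they generally differ.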

fpc may be used to provide sampling fractions for finite population correction of variance estimates. The length of fpc should equal the number of strata. fpc can be omitted if fpcvar was provided to mm_jk().


The following example illustrates the basic usage of mm_jk() and mm_jk_report():

    : x = uniform(75,2)
    : J = mm_jk(&mean(), x, 1)

    Jackknife replications (75)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ..................................................    50
    .........................

    : mm_jk_report(J, ("b", "se"))
                     1             2
        +-----------------------------+
      1 |  .4512322009   .4951004972  |
      2 |   .032499494   .0333034185  |
        +-----------------------------+

    : mm_jk_report(J, "ci")
                     1             2
        +-----------------------------+
      1 |  .3864755455   .4287419872  |
      2 |  .5159888562   .5614590071  |
        +-----------------------------+

mm_jk() first produces the 75 jackknife leave-one-out estimates of the means of the two variables contained in x. mm_jk_report() then reports the "observed" values, i.e. the means of the two variables in x (first row) and the jackknife standard errors of the means (second row). The second call of mm_jk_report() displays the 95% jackknife confidence intervals for the two means (lower bound in first row, upper bound in second row).

Methods and formulas are as described in [R] jackknife and [SVY] variance estimation. Delete-k jackknife is not supported.


    mm_jk(f, X, w, nodots, strata, cluster, subpop, fpcvar, stat, ...):
               f:  1 x 1
               X:  n x k
               w:  n x 1 or 1 x 1
          nodots:  1 x 1
          strata:  n x 1 or strata=.
         cluster:  n x 1 or cluster=.
          subpop:  n x 1 or subpop=.
          fpcvar:  n x 1 or fpcvar=.
            stat:  1 x p or 2 x p or stat=.
             ...:  (depending on f)
          result:  struct mm_jkstats

    (*f)(X, w, ...):
               X:  n x k
               w:  n x 1 or 1 x 1
             ...:  (depending on f)
          result:  1 x p or 2 x p

    mm_jk_report(jk, what, level, mse, fpc):
              jk:  struct mm_jkstats
            what:  s x 1 or 1 x s
           level:  1 x 1
             mse:  1 x 1
             fpc:  m x 1 or fpc=. (m: number of strata)
          result:  r x p



Source code



Ben Jann, ETH Zurich

Also see

Online: help for jackknife, svy jackknife, mm_bs(), moremata