{smcl}
{* 10jul2006}{...}
{cmd:help mata mm_bs()}
{hline}

{title:Title}

{pstd}
{bf:mm_bs() -- Bootstrap estimation}


{title:Syntax}

{p 11 18 2}
{it:bs} = {cmd:mm_bs(}{it:f}{cmd:,} {it:X} [{cmd:,}
{it:w}{cmd:,}
{it:reps}{cmd:,}
{it:d}{cmd:,}
{it:nodots}{cmd:,}
{it:strata}{cmd:,}
{it:cluster}{cmd:,}
{it:stat}{cmd:,}
{it:...}]{cmd:)}

{p 11 18 2}
{it:bs} = {cmd:mm_bs2(}{it:f}{cmd:,} {it:X} [{cmd:,}
{it:w}{cmd:,}
{it:reps}{cmd:,}
{it:d}{cmd:,}
{it:nodots}{cmd:,}
{it:strata}{cmd:,}
{it:cluster}{cmd:,}
{it:stat}{cmd:,}
{it:...}]{cmd:)}

{p 4 4 2}
where

{p 12 18 2}
  {it:f}:  {it:pointer scalar} containing address of function to be
  bootstrapped, i.e. {it:f} = {cmd:&}{it:functionname}{cmd:()}
  {p_end}
{p 12 18 2}
  {it:X}:  {it:real matrix} containing data (rows are observations,
  columns variables)
  {p_end}
{p 12 18 2}
  {it:w}:  {it:real colvector} containing weights
  {p_end}
{p 9 18 2}
  {it:reps}:  {it:real scalar} specifying number of replications
  (default: 50)
  {p_end}
{p 12 18 2}
  {it:d}:  {it:real scalar} specifying reduction of bootstrap sample size
  (default: 0)
  {p_end}
{p 7 18 2}
  {it:nodots}:  {it:real scalar} indicating that replication dots be
  suppressed
  {p_end}
{p 7 18 2}
  {it:strata}:  {it:real colvector} containing (sorted) strata ID variable
  {p_end}
{p 6 18 2}
  {it:cluster}:  {it:real colvector} containing (sorted) cluster ID variable
  {p_end}
{p 9 18 2}
  {it:stat}:  {it:real matrix} containing the results of {it:f}
  using the original data, i.e., the "observed" value of {it:f}
  {p_end}
{p 10 18 2}
  {it:...}:  up to 10 optional arguments to pass through to {it:f}
  {p_end}


{p 4 18 2}
{it:real matrix} {cmd:mm_bs_report(}{it:bs} [{cmd:,}
{it:what}{cmd:,}
{it:level}{cmd:,}
{it:mse}{cmd:,}
{it:jk}]{cmd:)}

{p 4 4 2}
where

{p 9 18 2}
  {it:what}:  {it:string vector} containing statistics to be
  reported, where the available statistics are: {cmd:"b"}
or {cmd:"theta"} ("observed" value),
{cmd:"mean"} (bootstrap mean),
{cmd:"bias"} (bootstrap mean - observed value),
{cmd:"v"} (variance-covariance matrix),
{cmd:"se"} (standard error; the default),
{cmd:"ci"} or {cmd:"n"}  (normal-approximation confidence interval),
{cmd:"basic"} (basic confidence interval),
{cmd:"p"} (percentile confidence interval),
{cmd:"bc"} (bias-corrected confidence interval),
{cmd:"bca"} (bias-corrected and accelerated confidence interval),
{cmd:"t"} (percentile-t confidence interval){p_end}
{p 8 18 2}
  {it:level}:  {it:real scalar} containing the confidence level
  for confidence intervals (default is 95 or as
  set by {helpb set level})
  {p_end}
{p 10 18 2}
  {it:mse}:  {it:real scalar} indicating that the mean squared
  error formula be used
  {p_end}
{p 11 18 2}
  {it:jk}:  {it:struct mm_jkstats} containing results from
  {helpb mf_mm_jk:mm_jk()} (required if {it:what} contains
  {cmd:"bca"})
  {p_end}

{pstd}{it:bs} is a variable used for communication between
{cmd:mm_bs()} and {cmd:mm_bs_report()}. If you declare {it:bs},
declare it to be {it:transmorphic}.


{title:Description}

{pstd}
{cmd:mm_bs(}{it:f}{cmd:,} {it:X}{cmd:,} {it:w}{cmd:)}
applies function {it:f} to bootstrap samples of the
data {it:X} (and weights {it:w}) and returns the results as a
structure. To be precise, {it:f} is a pointer to a function,
i.e. {bind:{it:f} = {cmd:&}{it:functionname}{cmd:()}}, e.g.
{bind:{it:f} = {cmd:&mean()}} (see {helpb m2_ftof:[M-2] ftof}).
{cmd:mm_bs()} expects function {it:f} to return a
{it:real rowvector} of parameter estimates to be bootstrapped or,
optionally, a {it:real matrix} containing parameter estimates in the first row
and associated standard errors in the second row (the standard
errors are required for percentile-t confidence intervals; see
Remarks below). Furthermore, function {it:f} must take the data as the
first argument and weights as the second argument.

{pstd}
Note that the weights {it:w} are not relevant for the bootstrap
resampling process. That is, {cmd:mm_bs()} always draws simple
(i.e. equal probability) random samples (with replacement),
no matter whether weights are
specified or not. However, weights are passed through
to the internal calls of function {it:f}. Omit {it:w}, or specify
{it:w} as 1 to obtain unweighted results. {it:w}=1 is passed to
function {it:f} if {it:w} is omitted.
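
{pstd}As a minimal sketch of how weights enter (the data {cmd:x} and the
weights {cmd:w} below are made up for illustration), the following call
draws equal-probability samples and hands the weights on to {cmd:mean()}
in each replication:

        {com}: x = uniform(75,2)
        : w = 1 :+ floor(3*uniform(75,1))
        : B = mm_bs(&mean(), x, w, 200)
        : mm_bs_report(B, ("b", "se")){txt}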

{pstd}{it:reps} specifies the number of desired bootstrap replicates.
The default is 50, which is too low for most applications. The default
sample size for the single bootstrap samples is the number of
observations in the data (or number of clusters, if {it:cluster} is
specified). However, {it:d}>0 causes the default bootstrap sample
size to be reduced by {it:d} (within each stratum if {it:strata}
is specified). For example, specify {it:d}=1 to produce
bootstrap samples containing only N-1 observations. Specify {it:d}=0
to leave the default sample size unchanged.
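
{pstd}For illustration only (the data {cmd:x} are simulated and the number
of replications is an arbitrary choice), the following sketch requests
1,000 replications, each based on N-1 = 74 observations:

        {com}: x = uniform(75,2)
        : B = mm_bs(&mean(), x, 1, 1000, 1)
        : mm_bs_report(B, "se"){txt}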

{pstd}{it:nodots}!=0 indicates that replication dots be suppressed.
By default, a single dot character is displayed for each successful replication and
a single red 'x' is displayed for each unsuccessful replication. A
replication is considered unsuccessful if the replication result
contains one or more missing values. {cmd:mm_bs()} only returns
results from successful replications.

{pstd}{it:strata} and {it:cluster} may be used to specify a strata ID
variable and a cluster ID variable. {cmd:mm_bs()} will then draw
stratified samples of clusters. Note that {cmd:mm_bs()} does not
sort the data: A new stratum begins each time
{it:strata} changes from one row to the next, a new cluster begins
each time {it:cluster} (or {it:strata}) changes from one row to the
next. Omit {it:strata} or specify {it:strata}=. if the sample is
unstratified; omit {it:cluster} or specify
{it:cluster}=. if the sample does not contain clusters.
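
{pstd}A small sketch of a stratified cluster bootstrap follows. The ID
vectors are hypothetical: two strata of 45 and 30 rows and 25 clusters of
3 consecutive rows each, so that the data are already in the required
order. If your data are not ordered this way, sort the data and the ID
vectors jointly (e.g. using {cmd:sort()}) before calling {cmd:mm_bs()}.

        {com}: x = uniform(75,2)
        : strata  = J(45,1,1) \ J(30,1,2)
        : cluster = (1::25) # J(3,1,1)
        : B = mm_bs(&mean(), x, 1, 200, 0, 0, strata, cluster)
        : mm_bs_report(B, "se"){txt}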

{pstd}By default, {cmd:mm_bs()} first applies {it:f} to the original
data to obtain the "observed" value of {it:f} given {it:X} and
{it:w}. Alternatively, the "observed" value may be provided as
{it:stat}, where {it:stat} is a {it:real matrix} containing point estimates
in the first row and, optionally, associated standard errors in the
second row. Omit {it:stat} or specify {it:stat}=. if you do not want to provide
the "observed" value.
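
{pstd}The following sketch combines a precomputed {it:stat} with an
optional argument that is passed through to {it:f}. The function
{cmd:rawmom()} is a hypothetical helper written for this example; it
returns the raw moment of order {it:k} of each column. Note that
{it:strata} and {it:cluster} are set to missing so that the later
arguments can be specified.

        {com}: real rowvector rawmom(real matrix X, real colvector w, real scalar k)
        > {c -(}
        >         return(mean(X:^k, w))
        > {c )-}

        : x = uniform(75,2)
        : b = rawmom(x, 1, 2)
        : B = mm_bs(&rawmom(), x, 1, 200, 0, 0, ., ., b, 2)
        : mm_bs_report(B, ("b", "se")){txt}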

{pstd}{cmd:mm_bs2()} is an alternative version of {cmd:mm_bs()}. Instead of
physically sampling the data, {cmd:mm_bs2()}
implements bootstrap estimation by multiplying {it:w}
by the number of times an observation belongs to the bootstrap sample.
{cmd:mm_bs2()} requires less memory than {cmd:mm_bs()}, but it is slower
and is valid only if, for the statistic in question, multiplying {it:w}
by the number of draws correctly represents an observation that has been
drawn multiple times.
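
{pstd}As a sketch, {cmd:mean()} qualifies because a weight of 2 has the
same effect on the weighted mean as including the observation twice. A
function that ignores {it:w}, or that takes the sample size from
{cmd:rows(}{it:X}{cmd:)}, would not qualify. The data {cmd:x} are
simulated for illustration.

        {com}: x = uniform(75,2)
        : B = mm_bs2(&mean(), x, 1, 200)
        : mm_bs_report(B, "se"){txt}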

{pstd}The results produced by {cmd:mm_bs()} and {cmd:mm_bs2()} depend
on the initial value of the random-number seed. Use {helpb set seed}
or {helpb mf_uniformseed:uniformseed()} to set the seed.

{pstd}{cmd:mm_bs_report()} is used to analyze the bootstrap
replications computed by {cmd:mm_bs()} or {cmd:mm_bs2()}. It
returns a matrix of statistics such as
bootstrap means, bootstrap standard errors, or various versions
of bootstrap confidence intervals (see the {it:what} argument above).
Multiple statistics are arranged beneath one another in the specified
order. For example, {cmd:mm_bs_report(}{it:bs}{cmd:, ("b","se","ci"))} will
return the observed values in the first row, the standard errors in the
second row, and the lower and upper bounds of the normal-approximation
confidence intervals in the third and fourth rows.
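
{pstd}Because the result is an ordinary {it:real matrix}, individual
statistics can be picked out by row. A brief sketch (the names {cmd:x},
{cmd:B}, and {cmd:R} are illustrative only): the first subscript below
returns the observed values, the second the standard errors, and the
third the lower and upper confidence bounds.

        {com}: x = uniform(75,2)
        : B = mm_bs(&mean(), x, 1, 200)
        : R = mm_bs_report(B, ("b", "se", "ci"))
        : R[1, .]
        : R[2, .]
        : R[(3,4), .]{txt}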

{pstd}
{it:level} specifies the confidence level, as a percentage, for confidence
intervals.  The default is {it:level}=95 or as set by
{helpb set level}.

{pstd} {it:mse}!=0 indicates that variances and standard errors be
computed using deviations of the replicates from the "observed"
value. By default, variances and standard errors are computed
using deviations from the average of the replicates.
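
{pstd}For example, assuming simulated data {cmd:x} as in the Remarks
below, the first report requests 90% percentile confidence intervals and
the second requests standard errors based on deviations from the
"observed" value:

        {com}: x = uniform(75,2)
        : B = mm_bs(&mean(), x, 1, 200)
        : mm_bs_report(B, "p", 90)
        : mm_bs_report(B, "se", 95, 1){txt}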

{pstd}
{it:jk} provides jackknife statistics as returned by
{helpb mf_mm_jk:mm_jk()}. {it:jk} is required for bias-corrected and
accelerated confidence intervals ({cmd:"bca"}). Omit {it:jk} or
specify {it:jk}=. if no bias-corrected and
accelerated confidence intervals are requested.


{title:Remarks}

{pstd}Remarks are presented under the headings

{phang2}{it:{help mf_mm_bs##r1:Introduction}}{p_end}
{phang2}{it:{help mf_mm_bs##r2:BCa confidence intervals}}{p_end}
{phang2}{it:{help mf_mm_bs##r3:Percentile-t confidence intervals}}{p_end}
{phang2}{it:{help mf_mm_bs##r4:Methods and formulas}}{p_end}


{marker r1}{pstd}{ul:{it:Introduction}}

{pstd}The following example illustrates the basic usage of
{cmd:mm_bs()} and {cmd:mm_bs_report()}:

        {com}: x = uniform(75,2)
        {res}
        {com}: B = mm_bs(&mean(), x, 1, 200)
        {res}{txt}
        Bootstrap replications ({res}200{txt})
        {txt}{hline 4}{c +}{hline 3} 1 {hline 3}{c +}{hline 3} 2 {hline 3}{c +}{hline 3} 3 {hline 3}{c +}{hline 3} 4 {hline 3}{c +}{hline 3} 5
        ..................................................    50
        ..................................................   100
        ..................................................   150
        ..................................................   200

        {com}: mm_bs_report(B, ("b", "se"))
        {res}       {txt}          1             2
            {c TLC}{hline 29}{c TRC}
          1 {c |}  {res}.5059552237   .4677170853{txt}  {c |}
          2 {c |}  {res}.0307029664   .0333863864{txt}  {c |}
            {c BLC}{hline 29}{c BRC}

        {com}: mm_bs_report(B, "ci")
        {res}       {txt}          1             2
            {c TLC}{hline 29}{c TRC}
          1 {c |}  {res}.4454103082   .4018805821{txt}  {c |}
          2 {c |}  {res}.5665001392   .5335535885{txt}  {c |}
            {c BLC}{hline 29}{c BRC}{txt}

{pstd}{cmd:mm_bs()} first produces 200 bootstrap replicates of the
means of the two variables contained in {cmd:x}. {cmd:mm_bs_report()} then reports the "observed"
values, i.e. the means of the two variables in {cmd:x} (first row) and the
bootstrap standard errors of the means (second row). The second call
of {cmd:mm_bs_report()} displays the 95% normal-approximation
confidence intervals for the two means (lower bound in first row,
upper bound in second row).


{marker r2}{pstd}{ul:{it:BCa confidence intervals}}

{pstd}Bias-corrected and accelerated confidence intervals require the
user to provide jackknife replicates of the parameter estimates. Use
the {helpb mf_mm_jk:mm_jk()} function to compute these
statistics. Example:

        {com}: J = mm_jk(&mean(), x, 1)
        {res}{txt}
        Jackknife replications ({res}75{txt})
        {txt}{hline 4}{c +}{hline 3} 1 {hline 3}{c +}{hline 3} 2 {hline 3}{c +}{hline 3} 3 {hline 3}{c +}{hline 3} 4 {hline 3}{c +}{hline 3} 5
        ..................................................    50
        .........................

        {com}: mm_bs_report(B, "bca", 95, 0, J)
        {res}       {txt}          1             2
            {c TLC}{hline 29}{c TRC}
          1 {c |}  {res}.4468480011   .4088091557{txt}  {c |}
          2 {c |}  {res}.5566050638   .5391249383{txt}  {c |}
            {c BLC}{hline 29}{c BRC}{txt}


{marker r3}{pstd}{ul:{it:Percentile-t confidence intervals}}

{pstd}The formula for the (1-alpha) percentile-t confidence interval is

        [ b - t(1-alpha/2) * se, b - t(alpha/2) * se ]

{pstd}where b is the original parameter estimate, se is an estimate of
the standard error of b and t({it:p}) is the {it:p}-quantile of
the asymptotically pivotal statistic

        t_i = (b_i - b)/se_i,  i = 1, ..., B

{pstd}where i indexes the B bootstrap replicates.
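
{pstd}To make the formula concrete, the following stand-alone sketch
computes a percentile-t interval for the mean of a single variable by
hand. It is not the code used by {cmd:mm_bs_report()}: the function
{cmd:tci()} is hypothetical, it uses the textbook standard error of the
mean, and it takes simple order statistics of the sorted t_i as
quantiles, which may differ from the quantile definition used by
{cmd:mm_bs_report()}.

        {com}: real rowvector tci(real colvector x, real scalar reps, real scalar level)
        > {c -(}
        >         real scalar    n, b, se, i, a
        >         real colvector t, xb
        >
        >         n  = rows(x)
        >         b  = mean(x)                    // original estimate b
        >         se = sqrt(variance(x)/n)        // standard error of b
        >         t  = J(reps, 1, .)
        >         for (i=1; i<=reps; i++) {c -(}
        >                 xb   = x[1 :+ floor(n*uniform(n,1))]  // draw n rows with replacement
        >                 t[i] = (mean(xb) - b)/sqrt(variance(xb)/n)
        >         {c )-}
        >         t = sort(t, 1)
        >         a = (100 - level)/200           // alpha/2
        >         return((b - t[ceil(reps*(1-a))]*se, b - t[max((1, floor(reps*a)))]*se))
        > {c )-}

        : x = uniform(75,1)
        : tci(x, 200, 95){txt}

{pstd}Note how the upper quantile of t_i determines the lower bound and
the lower quantile determines the upper bound.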

{pstd}To enable the computation of percentile-t confidence intervals,
function {it:f} provided to {cmd:mm_bs()} must return standard
error estimates along with the point estimates of the parameters
(point estimates in the first row, standard errors in the second
row). The following
example illustrates the procedure using the textbook formula
for the standard error of the mean:

        {com}: real matrix meanse(x, w)
        > {c -(}
        >         if (w!=1) _error(3498, "w must be 1")
        >         return(mean(x, 1) \
        >          sqrt(diagonal(variance(x, 1))'/rows(x)))
        > {c )-}

        : meanse(x,1)
        {res}       {txt}          1             2
            {c TLC}{hline 29}{c TRC}
          1 {c |}  {res}.5059552237   .4677170853{txt}  {c |}
          2 {c |}  {res}.0326460175   .0347510918{txt}  {c |}
            {c BLC}{hline 29}{c BRC}

        {com}: B = mm_bs(&meanse(), x, 1, 200)
        {res}{txt}
        Bootstrap replications ({res}200{txt})
        {txt}{hline 4}{c +}{hline 3} 1 {hline 3}{c +}{hline 3} 2 {hline 3}{c +}{hline 3} 3 {hline 3}{c +}{hline 3} 4 {hline 3}{c +}{hline 3} 5
        ..................................................    50
        ..................................................   100
        ..................................................   150
        ..................................................   200

        {com}: mm_bs_report(B, "t")
        {res}       {txt}          1             2
            {c TLC}{hline 29}{c TRC}
          1 {c |}  {res}.4442775319    .406048357{txt}  {c |}
          2 {c |}  {res}.5645887108    .537843043{txt}  {c |}
            {c BLC}{hline 29}{c BRC}{txt}

{pstd}An alternative approach may be to
compute the standard errors by the bootstrap, i.e. to perform
bootstrap-within-bootstrap or nested bootstrap. Example:

        {com}: real matrix meanbsse(x, w)
        > {c -(}
        >         if (w!=1) _error(3498, "w must be 1")
        >         return(mean(x, 1) \
        >          mm_bs_report(mm_bs(&mean(),x,1,50,0,1),"se"))
        > {c )-}

        : meanbsse(x,1)
        {res}       {txt}          1             2
            {c TLC}{hline 29}{c TRC}
          1 {c |}  {res}.5059552237   .4677170853{txt}  {c |}
          2 {c |}  {res}.0294400659   .0297055941{txt}  {c |}
            {c BLC}{hline 29}{c BRC}

        {com}: B = mm_bs(&meanbsse(), x, 1, 200)
        {res}{txt}
        Bootstrap replications ({res}200{txt})
        {txt}{hline 4}{c +}{hline 3} 1 {hline 3}{c +}{hline 3} 2 {hline 3}{c +}{hline 3} 3 {hline 3}{c +}{hline 3} 4 {hline 3}{c +}{hline 3} 5
        ..................................................    50
        ..................................................   100
        ..................................................   150
        ..................................................   200

        {com}: mm_bs_report(B, "t")
        {res}       {txt}          1             2
            {c TLC}{hline 29}{c TRC}
          1 {c |}  {res} .438978656   .4096034266{txt}  {c |}
          2 {c |}  {res}.5833692536    .552163646{txt}  {c |}
            {c BLC}{hline 29}{c BRC}{txt}


{marker r4}{pstd}{ul:{it:Methods and formulas}}

{pstd}The formulas for the percentile-t confidence intervals can be
found above. Also see, e.g., Poi (2004).

{pstd}The basic confidence interval is defined as

        [ 2*b - q(1-alpha/2), 2*b - q(alpha/2) ]

{pstd}where b is the original parameter estimate and q({it:p}) is the
{it:p}-quantile of the bootstrap distribution of b (see, e.g., Davison and Hinkley
1997).

{pstd}For all other formulas see {bf:[R] bootstrap}. Note that, contrary
to what is indicated in {bf:[R] bootstrap}, 1/k is used instead of
1/(k-1) in the mean squared error formula (the same is true for
official Stata's {cmd:bootstrap}).


{title:Conformability}

{pstd}
{cmd:mm_bs(}{it:f}{cmd:,} {it:X}{cmd:,} {it:w}{cmd:,}
{it:reps}{cmd:,} {it:d}{cmd:,} {it:nodots}{cmd:,}
{it:strata}{cmd:,} {it:cluster}{cmd:,} {it:stat}{cmd:,}
{it:...}{cmd:)},
{p_end}
{pstd}
{cmd:mm_bs2(}{it:f}{cmd:,} {it:X}{cmd:,} {it:w}{cmd:,}
{it:reps}{cmd:,} {it:d}{cmd:,} {it:nodots}{cmd:,}
{it:strata}{cmd:,} {it:cluster}{cmd:,} {it:stat}{cmd:,}
{it:...}{cmd:)}:
{p_end}
           {it:f}:  1 {it:x} 1
           {it:X}:  {it:n x k}
           {it:w}:  {it:n x} 1 or 1 {it:x} 1
        {it:reps}:  1 {it:x} 1
           {it:d}:  1 {it:x} 1
      {it:nodots}:  1 {it:x} 1
      {it:strata}:  {it:n x} 1 or {it:strata}=.
     {it:cluster}:  {it:n x} 1 or {it:cluster}=.
        {it:stat}:  {it:m x p}, m>0, or {it:stat}=.
         {it:...}:  (depending on {it:f})
      {it:result}:  {it:struct mm_bsstats}

{pstd}
{cmd:(*}{it:f}{cmd:)(}{it:X}{cmd:,} {it:w}{cmd:,} {it:...}{cmd:)}:
{p_end}
           {it:X}:  {it:n x k}
           {it:w}:  {it:n x} 1 or 1 {it:x} 1
         {it:...}:  (depending on {it:f})
      {it:result}:  {it:m x p}, m>0

{pstd}
{cmd:mm_bs_report(}{it:bs}{cmd:,}
{it:what}{cmd:,} {it:level}{cmd:,}
{it:mse}{cmd:,} {it:jk}{cmd:)}:
{p_end}
          {it:bs}:  {it:struct mm_bsstats}
        {it:what}:  {it:s x} 1 or 1 {it:x s}
       {it:level}:  1 {it:x} 1
         {it:mse}:  1 {it:x} 1
          {it:jk}:  {it:struct mm_jkstats} or {it:jk}=.
      {it:result}:  {it:r x p}


{title:Diagnostics}

{pstd}{cmd:mm_bs()} and {cmd:mm_bs2()} cannot be used with built-in
functions directly. To bootstrap a built-in function, wrap it in a
user-defined function and pass a pointer to the wrapper.
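
{pstd}A wrapper can be as short as the following sketch, which assumes
that {cmd:colsum()} is a built-in function in your version of Mata. The
name {cmd:mycolsum()} is hypothetical; the wrapper takes the data and the
weights as its first two arguments, as {cmd:mm_bs()} requires, and
returns the weighted column sums.

        {com}: real rowvector mycolsum(real matrix X, real colvector w)
        > {c -(}
        >         return(colsum(X:*w))
        > {c )-}

        : x = uniform(75,2)
        : B = mm_bs(&mycolsum(), x, 1, 200)
        : mm_bs_report(B, ("b", "se")){txt}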

{pstd}{cmd:mm_bs_report()} will abort with error if {it:what}
contains {cmd:"bca"} and no jackknife estimates are provided.

{pstd}{cmd:mm_bs_report()} with {it:what}
containing {cmd:"t"} will abort with error if applied to
bootstrap replicates that do not contain information on standard
errors.


{title:Source code}

{pstd}
{help moremata_source##mm_bs:mm_bs.mata}


{title:References}

{phang}Davison, A.C., Hinkley, D.V. (1997). Bootstrap Methods and
Their Application. Cambridge University Press.

{phang}Poi, B.P. (2004). From the help desk: Some bootstrapping
techniques. Stata Journal 4(3):312-328.


{title:Author}

{pstd} Ben Jann, University of Bern, jann@soz.unibe.ch


{title:Also see}

{psee}
Online:  help for
{bf:{help bootstrap}},
{bf:{help mf_mm_jk:mm_jk()}},
{bf:{help moremata}}
{p_end}