```-------------------------------------------------------------------------------
help for samplepps       Stephen P. Jenkins (June 2005, help revised May 2008)
-------------------------------------------------------------------------------

Draw random sample, proportional to size, of n cases

samplepps newvar [if exp] [in range] , ncases(integer) size(sizevar)
[withrepl]

Description

samplepps draws a random sample with ncases observations from the current
data set, with probabilities proportional to size (`pps'). The default is
to select cases without replacement; optionally cases may be selected with
replacement.

If sampling is without replacement, the variable newvar is equal to 1 for
selected cases, and 0 for non-selected cases. The program returns an error
if the number of cases to be selected is greater than the number of valid
observations, or if any observation has sizevar/(SUM_i sizevar) >=
1/ncases.

If sampling is with replacement, the variable newvar is equal to a positive
integer for selected cases (the integer is the number of times the case has
been selected), and 0 for non-selected cases. For both types of sampling,
newvar is missing if sizevar is missing.

If you are serious about drawing random samples, you must first set the
random number seed; see generate.

Methods for sampling with probabilities proportional to size are discussed
by Lohr (1999). See also Levy and Lemeshow (1991, chapter 11) and Som
(1973, chapter 5), who focus on the with-replacement case. The algorithm
used by samplepps for the with-replacement case is the standard `cumulative
method'.  For the without-replacement case, I used an algorithm described
by Jean-Yves Pip Courbois (formerly at the University of Washington),
originally due to Madow (1949).  For more details, see Brewer and Hanif (1983)
and Cochran (1977, p. 265), who cites Hartley and Rao (1962) and Madow (1949).

Options

ncases(integer) specifies the number of observations to be selected.

size(sizevar) specifies the name of the existing variable summarizing
`size'.

withrepl specifies selection with replacement. (If the option is specified,
a given observation may be selected more than once.)

Saved results

r(ncases) is the integer ncases.

r(nobs) is the number of valid observations at risk of being sampled.

r(sizevar) contains the name sizevar.

r(withrepl) = 1 if the with-replacement option was specified.

r(sample) contains the name newvar.

Examples

. // select a sample of schools with selection probabilities depending
. // on the number of pupils per school.

. use schools.dta, clear

. set seed 123517

. samplepps pick1, size(n_pupils) n(100)

. samplepps pick2, size(n_pupils) n(50) withrepl

Acknowledgements

Program written with support of ESRC grant number RES-000-22-0995 ("Social
segregation in UK schools: benchmarking with international comparisons").
For helpful discussions, I thank project colleagues John Micklewright and
Sylke Schnepf, and also Philippe Van Kerm. Steven Samuels drew my attention
to the references by Cochran, Hartley and Rao, and Madow. Ben Jann drew my
attention to the Brewer and Hanif reference.

Author

Stephen P. Jenkins, ISER, University of Essex, U.K.
<stephenj@essex.ac.uk>

References

Brewer, K. R. W. and Muhammad Hanif. 1983. Sampling with Unequal
Probabilities. New York: Springer.

Cochran, William G. 1977. Sampling Techniques, 3rd Edition. New York:
Wiley.

Madow, William G. 1949. On the theory of systematic sampling. II. Annals of
Mathematical Statistics, 20: 333-354.

Hartley, H.O. and J.N.K. Rao. 1962. Sampling with unequal probabilities and
without replacement.  Annals of Mathematical Statistics, 33: 350-374.

Levy, Paul S. and Stanley Lemeshow. 1991. Sampling of Populations: Methods
and Applications, 2nd edition.  New York: John Wiley and Sons.

Lohr, Sharon L. 1999. Sampling: Design and Analysis. Pacific Grove CA:
Duxbury Press.

Som, Ranjan K. 1973. Practical Sampling Techniques, second edition, revised
and expanded.  New York: Marcel Dekker.

Also see

Manual:  [S-Z] sample

On-line:  help for sample, and gsample if installed.

```
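
Illustrative sketch (not part of samplepps)

The Description above names two selection schemes: the cumulative method for sampling with replacement, and a systematic scheme attributed to Madow (1949) for sampling without replacement. The Python sketch below shows one textbook reading of each, under stated assumptions. It is not the code that samplepps runs. The function names pps_with_replacement and pps_systematic are hypothetical, sizes are assumed to be non-missing and non-negative, and the systematic version takes the units in the order given rather than reordering them. The two error checks mirror the conditions listed in the Description.

```python
import bisect
import itertools
import random


def pps_with_replacement(sizes, n, rng):
    """Cumulative method (sketch): count how often each unit is drawn."""
    cum = list(itertools.accumulate(sizes))  # running totals of size
    total = cum[-1]
    counts = [0] * len(sizes)
    for _ in range(n):
        point = rng.random() * total  # uniform draw on [0, total)
        # Pick the unit whose interval [cum[i-1], cum[i]) contains the point.
        idx = bisect.bisect_right(cum, point)
        counts[min(idx, len(sizes) - 1)] += 1  # clamp guards float rounding
    return counts


def pps_systematic(sizes, n, rng):
    """Systematic PPS without replacement (sketch): return 0/1 flags."""
    total = sum(sizes)
    if n > len(sizes):
        raise ValueError("n exceeds the number of units")
    # Same condition as in the help text: size/total >= 1/n is an error.
    if any(s * n >= total for s in sizes):
        raise ValueError("some unit has size/total >= 1/n")
    cum = list(itertools.accumulate(sizes))
    step = total / n                 # spacing between selection points
    start = rng.random() * step      # random start in [0, step)
    flags = [0] * len(sizes)
    for k in range(n):
        idx = bisect.bisect_right(cum, start + k * step)
        flags[min(idx, len(sizes) - 1)] = 1
    return flags


if __name__ == "__main__":
    rng = random.Random(123517)      # fix the seed for reproducibility
    sizes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # made-up sizes
    print(pps_with_replacement(sizes, 4, rng))
    print(pps_systematic(sizes, 3, rng))
```

Because every unit's size is smaller than the spacing between selection points, no unit can contain two points, which is why the systematic version never selects a unit twice. The with-replacement version places no such restriction, so counts above 1 are possible.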