```
.-
help for ^rc_spline^,
.-

Restricted Cubic Splines
------------------------

^rc_spline^ xvar [fw] [^if^ exp] [^in^ range] [, ^nk^nots^(^#^)^ ^kn^ots^(^numlist^)^]

Description
-----------

^rc_spline^ creates variables that can be used for regression models
in which the linear predictor f(xvar) is assumed to equal a restricted cubic
spline function of an independent variable xvar.  In these regressions, the
user explicitly or implicitly specifies k knots located at
xvar = t1, t2, ..., tk.  f(xvar) is defined to be a continuous smooth function
that is linear before t1, is a piecewise cubic polynomial between adjacent
knots, and is linear after tk.  See Harrell (2001) for additional details.

^rc_spline^ creates variables called _Sxvar1, _Sxvar2, ..., _Sxvar(k-1), where
"xvar" is the input variable name.  There is always one fewer variable created
than there are knots.  If the model has k parameters beta0, beta1, ...,
beta(k-1), then

f(xvar) = beta0 + beta1*_Sxvar1 + beta2*_Sxvar2 + ... + beta(k-1)*_Sxvar(k-1).

An important aspect of restricted cubic splines is that the variables _Sxvar1,
..., _Sxvar(k-1) are functions of xvar and the knots only and are not affected
by the response variable.  This means that we can use ^rc_spline^ to define
the _Sxvar* variables before specifying the response variable or the type of
regression model.

Restricted cubic splines are also called natural splines.

Options
-------

^nknots^ specifies the number of knots.

^knots^ specifies the exact location of the knots.  The values of these knots
must be given in increasing order.

If both of these options are given, they must specify the same number of
knots.  When ^knots^ is omitted, the default knot values are chosen according
to Table 2.3 of Harrell (2001), with the additional restriction that the
smallest knot may not be less than the 5th smallest value of xvar and the
largest knot may not be greater than the 5th largest value of xvar.  The
values of all the knots are displayed.  When ^knots^ is omitted, the number of
knots specified by ^nknots^ must be between 3 and 7.  The default number of
knots when neither ^nknots^ nor ^knots^ is given is 5.

Frequency weights are allowed.

Examples
--------

*
* Perform a linear regression of y against a restricted cubic spline (RCS)
* function of x with 5 knots.
*
. rc_spline x
. regress y _Sx1 _Sx2 _Sx3 _Sx4
*
* Perform a logistic regression of fate against the RCS function of x
* defined above.
*
. logistic fate _S*
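*
* Sketch (an addition, not one of the authors' examples): a Wald test of the
* non-linear terms in the logistic model above.  This assumes that _Sx1 is
* the linear term in x, as in the basis described by Harrell (2001), so that
* _Sx2, _Sx3 and _Sx4 carry all of the curvature.
*
. test _Sx2 _Sx3 _Sx4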

*
* Perform a linear regression of y against an RCS of x with 3 knots chosen
* at their default values according to Harrell (2001).  Graph the observed
* and expected values of y against x.
*
. drop _S*
. rc_spline x, nknots(3)
. regress y _S*
. predict yhat
. scatter y x || line yhat x
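*
* Sketch (an addition, not one of the authors' examples): overlay a 95%
* confidence band for the expected value of y on the graph above.  The
* variable names se, lb and ub are illustrative only.
*
. predict se, stdp
. generate lb = yhat - invttail(e(df_r), 0.025)*se
. generate ub = yhat + invttail(e(df_r), 0.025)*se
. twoway rarea lb ub x, sort || line yhat x, sort || scatter y x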

*
* Perform a proportional hazards regression analysis of fate against an RCS
* function of x with four knots specified at x = 2, 4, 6 and 8.
*
. drop _S*
. stset time, failure(fate)
. rc_spline x, knots(2 (2) 8)
. stcox _S*
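
*
* Sketch (an addition, not one of the authors' examples): a multiple linear
* regression of y against RCS functions of two covariates.  The second
* covariate z is hypothetical.  This assumes that a second call to
* ^rc_spline^ simply adds the _Sz* variables alongside the _Sx* variables.
*
. drop _S*
. rc_spline x, nknots(4)
. rc_spline z, nknots(3)
. regress y _Sx* _Sz*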

Remarks
-------

Restricted cubic splines provide a fairly general and robust approach for
adapting linear methods to model non-linear relationships between a response
variable and one or more continuous covariates.  They can often be used
effectively as an alternative to converting continuous to categorical
variables, which results in the discarding of information.  See Harrell (2001)
for arguments in favor of this approach and guidance on how to build models
with RCSs.

This program is similar to ^spline^ (Sasieni 1994).  It differs in the choice
of default knots and in its output.  ^spline^ requires the user to specify a
response and independent variable.  It then allows the user to specify a
number of different regression models and version 7 graphs.  In contrast,
^rc_spline^ only calculates the RCS covariates.  However, this allows the use
of the full range and power of Stata's regression, post-estimation and v.8
graph commands.  In particular, more sophisticated residual analyses and
graphs can be generated as well as multiple regression models involving more
than one independent variable.

See also ^mkspline^ for fitting models involving linear splines.

Authors
-------

William D. Dupont
W. Dale Plummer, Jr.
Department of Biostatistics
Vanderbilt University School of Medicine
Nashville, TN 37232-2158

e-mail:  william.dupont@vanderbilt.edu
dale.plummer@vanderbilt.edu

References
----------

Harrell, F.E: Regression Modeling Strategies with Applications to Linear
Models, Logistic Regression and Survival Analysis. New York: Springer-Verlag
2001.

Sasieni, P: Natural cubic splines. STB reprints 1994; 4: 19-22.  See also STB
reprints 1995; 4: 174, and package snp7_1 from http://www.stata.com/stb/stb24.

```