Restricted Cubic Splines
------------------------
^rc_spline^ xvar [fw] [^if^ exp] [^in^ range] [, ^nk^nots^(^#^)^ ^kn^ots^(^numlist^)^]
^rc_spline^ creates variables that can be used for regression models in which the linear predictor f(xvar) is assumed to equal a restricted cubic spline function of an independent variable xvar. In these regressions, the user explicitly or implicitly specifies k knots located at xvar = t1, t2, ..., tk. f(xvar) is defined to be a continuous smooth function that is linear before t1, is a piecewise cubic polynomial between adjacent knots, and is linear after tk. See Harrell (2001) for additional details.
^rc_spline^ creates variables called _Sxvar1, _Sxvar2, ..., _Sxvar(k-1), where "xvar" is the input variable name. There is always one fewer variable created than there are knots. If the model has k parameters beta0, beta1, ..., beta(k-1) then
f(xvar) = beta0 + beta1*_Sxvar1 + beta2*_Sxvar2 + ... + beta(k-1)*_Sxvar(k-1).
An important aspect of restricted cubic splines is that the variables _Sxvar1, ..., _Sxvar(k-1) are functions of xvar and the knots only and are not affected by the response variable. This means that we can use ^rc_spline^ to define the _Sxvar* variables before specifying the response variable or the type of regression model.
Restricted cubic splines are also called natural splines.
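The dependence of the spline variables on xvar and the knots alone can be illustrated with a short Python sketch. It assumes the basis and the (tk - t1)^2 normalization given by Harrell (2001); the exact scaling used by ^rc_spline^ may differ, and the function name rcs_basis is hypothetical.

```python
def rcs_basis(x, knots):
    """Return the k-1 restricted cubic spline terms for a single value x.

    Assumption: Harrell's (2001) basis, each nonlinear term divided by
    (tk - t1)^2. The first term is x itself.
    """
    t = list(knots)
    k = len(t)
    if k < 3 or any(a >= b for a, b in zip(t, t[1:])):
        raise ValueError("need at least 3 strictly increasing knots")

    def pos3(u):
        # Truncated cubic: u^3 when u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    scale = (t[-1] - t[0]) ** 2   # normalization (assumed)
    gap = t[-1] - t[-2]           # distance between the last two knots
    terms = [float(x)]            # analogue of _Sxvar1
    for j in range(k - 2):        # analogues of _Sxvar2 ... _Sxvar(k-1)
        term = (pos3(x - t[j])
                - pos3(x - t[-2]) * (t[-1] - t[j]) / gap
                + pos3(x - t[-1]) * (t[-2] - t[j]) / gap)
        terms.append(term / scale)
    return terms
```

For example, rcs_basis(5, [2, 4, 6, 8]) returns three terms for four knots. No response variable appears anywhere in the calculation, and every nonlinear term is zero before the first knot and linear in x after the last one.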
^nknots^ specifies the number of knots.
^knots^ specifies the exact location of the knots. The values of these knots must be given in increasing order.
If both of these options are given they must both specify the same number of knots. When ^knots^ is omitted the default knot values are chosen according to Table 2.3 of Harrell (2001) with the additional restriction that the smallest knot may not be less than the 5th smallest value of xvar and the largest knot may not be greater than the 5th largest value of xvar. The values of all the knots are displayed. When ^knots^ is omitted the number of knots specified by ^nknots^ must be between 3 and 7. The default number of knots when neither ^nknots^ nor ^knots^ is given is 5.
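The default knot rule can be sketched in Python as follows. The percentiles are those of Harrell's Table 2.3. The linear-interpolation quantile used here is an assumption and may not match the percentile definition used by ^rc_spline^; frequency weights are ignored, and the name default_knots is hypothetical.

```python
# Knot percentiles from Table 2.3 of Harrell (2001), keyed by number of knots.
HARRELL_PERCENTILES = {
    3: [0.10, 0.50, 0.90],
    4: [0.05, 0.35, 0.65, 0.95],
    5: [0.05, 0.275, 0.50, 0.725, 0.95],
    6: [0.05, 0.23, 0.41, 0.59, 0.77, 0.95],
    7: [0.025, 0.1833, 0.3417, 0.50, 0.6583, 0.8167, 0.975],
}


def default_knots(values, nknots=5):
    """Sketch of default knot placement for unweighted data."""
    if nknots not in HARRELL_PERCENTILES:
        raise ValueError("nknots must be between 3 and 7")
    xs = sorted(values)
    n = len(xs)
    if n < 5:
        raise ValueError("need at least 5 observations")

    def quantile(p):
        # Linear interpolation between order statistics (assumed definition).
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    knots = [quantile(p) for p in HARRELL_PERCENTILES[nknots]]
    # Outer knots may not fall outside the 5th smallest / 5th largest values.
    knots[0] = max(knots[0], xs[4])
    knots[-1] = min(knots[-1], xs[-5])
    return knots
```

With only 20 distinct observations 1, 2, ..., 20 and five knots, the restriction binds at both ends: the outer knots move to the 5th smallest and 5th largest values rather than staying at the 5th and 95th percentiles.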
Frequency weights are allowed.
 *
 * Perform a linear regression of y against a restricted cubic spline (RCS)
 * function of x with 5 knots.
 *
 . rc_spline x
 . regress y _Sx1 _Sx2 _Sx3 _Sx4
 *
 * Perform a logistic regression of fate against the RCS function of x
 * defined above.
 *
 . logistic fate _S*
 *
 * Perform a linear regression of y against a RCS of x with 3 knots chosen
 * at their default values according to Harrell (2001). Graph the observed
 * and expected values of y against x
 *
 . drop _S*
 . rc_spline x, nknots(3)
 . regress y _S*
 . predict yhat
 . scatter y x || line yhat x
 *
 * Perform a proportional hazard regression analysis of fate against a RCS
 * function of x with four knots specified at x = 2, 4, 6 and 8.
 *
 . drop _S*
 . stset time, failure(fate)
 . rc_spline x, knots(2 (2) 8)
 . stcox _S*

Remarks
-------
Restricted cubic splines provide a fairly general and robust approach for adapting linear methods to model non-linear relationships between a response variable and one or more continuous covariates. They can often be used effectively as an alternative to converting continuous to categorical variables, which results in the discarding of information. See Harrell (2001) for arguments in favor of this approach and guidance on how to build models with RCSs.
This program is similar to ^spline^ (Sasieni 1994). It differs in the choice of default knots and in its output. ^spline^ requires the user to specify a response and independent variable. It then allows the user to specify a number of different regression models and version 7 graphs. In contrast, ^rc_spline^ only calculates the RCS covariates. However, this allows the use of the full range and power of Stata's regression, post-estimation and v.8 graph commands. In particular, more sophisticated residual analyses and graphs can be generated, as well as multiple regression models involving more than one independent variable.
See also ^mkspline^ for fitting models involving linear splines.
William D. Dupont
W. Dale Plummer, Jr.
Department of Biostatistics
Vanderbilt University School of Medicine
Nashville, TN 37232-2158
e-mail: email@example.com firstname.lastname@example.org
Harrell, F.E: Regression Modeling Strategies with Applications to Linear Models, Logistic Regression and Survival Analysis. New York: Springer-Verlag 2001.
Sasieni, P: Natural cubic splines. STB reprints 1994; 4: 19-22. See also STB reprints 1995; 4: 174, and package snp7_1 from http://www.stata.com/stb/stb24.