Title
rcsgen -- Generate restricted cubic splines and derivatives
Syntax
rcsgen varname [if] [in] [, options]
options                Description
-------------------------------------------------------------------------
Options
  gen(stub)            stubname for generated spline variables
  dgen(stub)           stubname for generated derivatives of spline variables
  knots(numlist)       location of knots
  percentiles(numlist) location of knots using percentiles
  df(#)                degrees of freedom for knots
  bknots(numlist)      location of boundary knots
  orthog               orthogonalize generated spline variables
  rmatrix(matname)     use supplied matrix for orthogonalization
  if2(string)          use extra condition when generating knots using df or percentile options
  fw(varname)          name of variable containing weights when generating knots using the df or percentile options
  reverse              derives the spline variables in reversed order
  scalar(#)            a single value to calculate the spline basis for
Exactly one of the knots(), percentiles() or df() options should be specified. If none of them is specified, only one variable is created, which is a copy of varname.
Description
rcsgen generates basis functions for restricted cubic splines and (optionally) their derivatives. Restricted cubic spline functions assume linearity beyond the two boundary knots. Knots can be specified on the original scale of varname, at default percentiles, or at user-specified percentiles. Orthogonalization can be performed using Gram-Schmidt orthogonalization. When orthogonalizing, a matrix is returned, which can be useful for regenerating the orthogonalized spline variables for out-of-sample predictions.
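The linearity constraint can be illustrated with a short Python sketch (Python rather than Stata, so that it can be run on its own). It evaluates one common parameterization of the restricted cubic spline basis. The formula, the function name rcs_basis and the helper cube_plus are assumptions made for illustration; rcsgen's own code may differ in detail.

```python
def rcs_basis(x, knots):
    """Evaluate a restricted cubic spline basis at a single value x.

    Assumed parameterization (illustrative, not taken from rcsgen's source):
      v_1 = x
      v_j = (x - k_j)+^3 - lam_j*(x - k_min)+^3 - (1 - lam_j)*(x - k_max)+^3
    for each interior knot k_j, with lam_j = (k_max - k_j) / (k_max - k_min).
    """
    k = sorted(knots)
    if len(k) < 2:
        raise ValueError("need at least two (boundary) knots")
    kmin, kmax = k[0], k[-1]

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    basis = [float(x)]  # the first basis function is x itself
    for kj in k[1:-1]:
        lam = (kmax - kj) / (kmax - kmin)
        basis.append(cube_plus(x - kj)
                     - lam * cube_plus(x - kmin)
                     - (1 - lam) * cube_plus(x - kmax))
    return basis


# Five knots give four basis functions.
values = rcs_basis(100.0, [10, 30, 50, 70, 90])
```

Under this parameterization the cubic and quadratic terms cancel above the last knot, and every term after the first is zero below the first knot, so each basis function is linear outside the boundary knots.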
Options
gen(stub) gives a stubname for the generated restricted cubic spline variables. For example, gen(rcs) will create variables rcs1, rcs2, ....
dgen(stub) gives a stubname for the derivatives of the restricted cubic spline variables. For example, dgen(drcs) will create variables drcs1, drcs2, ....
knots(numlist) lists the locations of the knots. The boundary knots are included in the numlist.
percentiles(numlist) lists the percentiles at which the knots are placed. The boundary knots are included in the numlist.
df(#) sets the desired degrees of freedom (df). The number of internal knots is one less than the df. Internal knots are placed at equally spaced centiles of the distribution of varname. For example, for df(5), internal knots are placed at the 20th, 40th, 60th and 80th centiles of the distribution of varname. In addition, boundary knots are placed at the minimum and maximum values of varname, or at the values specified using the bknots() option.
bknots(numlist) lists the boundary knots when using the df() option. By default these are the minimum and maximum of varname.
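The mapping from a df value to centiles can be written out as a small Python sketch. The function name df_to_centiles is hypothetical and the code only reproduces the arithmetic described for df(); it does not call rcsgen.

```python
def df_to_centiles(df):
    """Return (interior, boundary) centiles implied by a df value.

    Interior knots: df - 1 equally spaced centiles.
    Boundary knots: the 0th and 100th centiles (minimum and maximum),
    the default described for df() when bknots() is not given.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    interior = [100.0 * i / df for i in range(1, df)]
    return interior, (0.0, 100.0)


interior, boundary = df_to_centiles(5)
# interior is [20.0, 40.0, 60.0, 80.0]
```

For df(3) the interior centiles are 33.3 and 66.7, which together with the two boundary knots gives four knots in total.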
orthog will orthogonalize the generated spline variables using Gram-Schmidt orthogonalization.
rmatrix(matname) will orthogonalize the generated spline variables using the supplied R matrix. If X is the N*p matrix of untransformed spline variables and Q is the N*(p+1) matrix of orthogonalized variables plus a column of ones, then X=QR.
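To show how an R matrix can be reused on new data, here is a minimal Python sketch of Gram-Schmidt orthogonalization on centred columns. The centring and scaling conventions, the layout of R (column means stored in its last row) and the function names gram_schmidt and apply_r are all assumptions made for illustration; they are not taken from rcsgen's Mata code.

```python
import math


def gram_schmidt(X):
    """Orthogonalize the columns of X (a list of N rows of p values).

    Returns (Q, R). The columns of Q have mean 0, sum of squares N, and
    are mutually orthogonal. R is a (p+1) x (p+1) upper-triangular
    matrix satisfying [X, 1] = [Q, 1] R.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Work on centred columns, stored column-wise.
    v = [[row[j] - means[j] for row in X] for j in range(p)]
    q = []
    R = [[0.0] * (p + 1) for _ in range(p + 1)]
    R[p] = means + [1.0]  # last row carries the column means
    for i in range(p):
        r = math.sqrt(sum(a * a for a in v[i]) / n)
        q.append([a / r for a in v[i]])
        R[i][i] = r
        for j in range(i + 1, p):
            # Remove the component of column j that lies along q[i].
            r = sum(a * b for a, b in zip(q[i], v[j])) / n
            v[j] = [b - r * a for a, b in zip(q[i], v[j])]
            R[i][j] = r
    Q = [[q[j][k] for j in range(p)] for k in range(n)]
    return Q, R


def apply_r(row, R):
    """Orthogonalize one new row of spline values using a stored R."""
    p = len(row)
    out = []
    for j in range(p):
        s = row[j] - R[p][j] - sum(R[i][j] * out[i] for i in range(j))
        out.append(s / R[j][j])
    return out


X = [[1.0, 1.0], [2.0, 4.0], [3.0, 9.0], [4.0, 16.0]]
Q, R = gram_schmidt(X)
new_q = apply_r([2.5, 6.25], R)  # a value not in the original data
```

Because R is triangular, a new row can be transformed by simple substitution without refitting, which is the idea behind storing the matrix for out-of-sample predictions.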
if2(condition) supplies an extra condition that is applied when generating the knots using the df or percentile options. For example, in survival (time-to-event) data, when splines are used for the time scale, it is common to base the knot locations on the distribution of uncensored event times.
fw(varname) gives the name of the variable containing weights when generating knots using the df or percentile options.
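The effect of restricting and weighting the data used for knot placement can be sketched in Python as follows. The percentile rule used here (smallest value whose cumulative share reaches the requested percentage) is a simple convention chosen for illustration and may differ from the rule Stata applies; the function names and the toy data are hypothetical.

```python
import math


def weighted_percentile(values, weights, pct):
    """Percentile of values with integer frequency weights.

    Illustrative rule: expand each value by its weight, sort, and take
    the smallest value whose cumulative share is at least pct percent.
    """
    expanded = sorted(v for v, w in zip(values, weights) for _ in range(w))
    if not expanded:
        raise ValueError("no observations selected")
    rank = max(1, math.ceil(pct / 100.0 * len(expanded)))
    return expanded[rank - 1]


def knots_from_subset(values, keep, weights, percentiles):
    """Compute knots from the observations where keep is true."""
    sel = [(v, w) for v, k, w in zip(values, keep, weights) if k]
    vals = [v for v, _ in sel]
    wts = [w for _, w in sel]
    return [weighted_percentile(vals, wts, p) for p in percentiles]


# Toy survival data: knots are based on uncensored times only (event == 1).
times = [2, 5, 7, 11, 13, 17]
event = [1, 0, 1, 1, 0, 1]
freq = [1, 1, 2, 1, 1, 1]
knots = knots_from_subset(times, event, freq, [0, 50, 100])
```

The censored times 5 and 13 play no part in the knot positions, although spline variables would still be generated for those observations.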
reverse derives the spline variables in reversed order, treating the last knot as the first and the first knot as the last. This can be used to constrain a regression model to have a constant effect after the last knot.
scalar(#) calculates the spline variables for a single value and stores the results in a series of Stata scalars. It is useful when obtaining in-sample or out-of-sample predictions in large datasets at a specific value of varname.
Example
You can specify where to position the knots.
. rcsgen x, knots(10 30 50 70 90) gen(rcs)
Alternatively, you can generate the knot positions according to the distribution of varname. In the example below the df(3) option is used, which means that 4 knots are placed at the 0th, 33rd, 67th and 100th centiles of weight.
. sysuse auto, clear
. rcsgen weight, gen(rcs) df(3)
. regress mpg rcs1-rcs3
. predictnl pred = xb(), ci(lci uci)
. twoway (rarea lci uci weight, sort) ///
         (scatter mpg weight, sort) ///
         (line pred weight, sort lcolor(black)), legend(off)
Authors
This command is based on rcs written by Chris Nelson (cn46@le.ac.uk) that comes with the strsrcs command available from SSC.
Paul Lambert (paul.lambert@le.ac.uk) added the percentile, rmatrix, if2, fw and scalar options. He also wrote the Mata code for the Gram-Schmidt orthogonalization (as opposed to using the orthog command).
Mark Rutherford (mjr40@le.ac.uk) added the df and bknots options.
Therese Andersson (therese.m-l.andersson@ki.se) added the reverse option.
Also see