help mfpi, help mfpi_plot


mfpi -- Modelling interactions between categorical and continuous/categorical covariates


mfpi [, mfpi_options] : regression_cmd [yvar1 [yvar2]] xvarlist [if] [in] [weight] [, regression_cmd_options]

mfpi_plot varname [if] [in] [ , mfpi_plot_options ]

mfpi_options Description ------------------------------------------------------------------------- adjust(xvarlist2) adjust linearly for xvarlist2 all include out-of-sample observations in generated variables center(center_list) determine centering for FP-transformed variables detail give additional details of fitted models df(df_list) set up degrees of freedom for each predictor in xvarlist flex(#) define flexibility of main-effects and interaction models fp1(fp1_zvarlist) use fractional polynomials (FPs) of degree 1 for main effects and interactions with fp1_zvarlist fp2(fp2_zvarlist) use FPs of degree 2 for main effects and interactions with fp2_zvarlist gendiff(stubname) generate new variable(s) that contains the difference between the estimated functions genf(string) generate new variables that contain fitted functions linear(linear_zvarlist) define list of variables whose linear interaction with trt_var is to be investigated mfpopts(mfp_options) additional options for mfp noscaling suppress scaling of continuous variables in FP transformations noci suppress standard errors and confidence intervals for gendiff() and genf() options select(select_list) set significance levels for variable selection in xvarlist showmodel show variables and FP transformations selected from xvarlist by mfp treatment(trt_var) required - defines the categorical 'treatment' variable regression_cmd_options options for regression_cmd -------------------------------------------------------------------------


regression_cmd may be clogit, cnreg, glm, intreg, logistic, logit, mlogit, nbreg, ologit, oprobit, poisson, probit, qreg, regress, rreg, stcox, stcrreg, stpm2 (if installed), streg, xtgee


varname for mfpi_plot must be an existing variable. Typically, varname is a member of linear_zvarlist, fp1_zvarlist or fp2_zvarlist. See also the remark on Working with composite variables for details of plotting with composite predictors, such as spline basis functions or fractional polynomials, which can be done using a special feature of linear_zvarlists.

mfpi_plot_options Description ------------------------------------------------------------------------- stubname(stubname) required if the gendiff() option of mfpi was not specified vn(#) variable number in linear(), fp1(), and fp2() level(#) select level of trt_var to compare with base level plot(plot) add other plots to generated graph graph_options options for graph twoway -------------------------------------------------------------------------

All weight types supported by regression_cmd are allowed; see weight.

yvar is not allowed for streg, stcox and stpm2; for these commands, you must first stset your data.


mfpi is designed to investigate the interaction of a categorical covariate (trt_var) with covariate(s) specified by any combination of the fp1(), fp2() and linear() options. Whether or not trt_var is specified as a factor variable, it is handled as one. Typically, trt_var is the treatment indicator variable in a randomized controlled trial of one or more treatments against a control arm, with the lowest value signifying the control arm.

mfpi adjusts interactions with trt_var for variables in xvarlist (which may be subjected to FP transformation selected by mfp) and/or linearly for variables in xvarlist2.

mfpi_plot produces a treatment-effect plot derived from variables saved with the gendiff() option of mfpi. A treatment-effect plot shows how the estimated treatment effect varies with a continuous covariate when an interaction between the two has been modelled.

Options for mfpi

adjust(xvarlist2) includes xvarlist2 as covariates with a linear functional form in all the fitted models.

all allows prediction (see the gendiff() and genf() options) for all available cases, irrespective of exclusion by if, in, or weights.

center(center_list) determines the centering of FP-transformed variables. The most likely requirement is to suppress centering for all such variables, which is done with center(no). The default behavior is centering on the mean of each continuous predictor. For further details, see the description of the center() option of fracpoly.

detail gives additional details of the fitted models.

df(df_list) sets up the degrees of freedom (df) for each predictor in xvarlist. The df (not counting the regression constant, _cons) are twice the degree of the FP, so, for example, a member of xvarlist fitted as a second-degree FP (FP2) has 4 df. The first item in df_list can be either # or varlist:#. Subsequent items must be varlist:#. Items are separated by commas, and varlist is specified in the usual way for variables. With the first type of item, the df for all predictors are taken to be #. With the second type of item, all members of varlist (which must be a subset of xvarlist) have # df.

The default df for a predictor of type varlist specified in xvarlist but not in df_list are assigned according to the number of distinct (unique) values of the predictor, as follows:

--------------------------------------------------------------- No. of distinct values Default df --------------------------------------------------------------- 1 (invalid -- covariate has variance 0) 2-3 1 4-5 min(2, dfdefault(#)) >=6 dfdefault(#) --------------------------------------------------------------- dfdefault(#) is an option of mfp (see the mfpopts() option and mfp); its default # is 4, meaning an FP2 function.

Example: df(4) All variables have 4 df.

Example: df(2, weight displ:4) weight and displ have 4 df, and all other variables have 2 df.

Example: df(weight displ:4, mpg:2) weight and displ have 4 df, mpg has 2 df, and all other variables have the default of 1 df.

Example: df(weight displ:4, 2) All variables have 2 df because the final 2 overrides the earlier 4.

flex(#) defines the flexibility of the main-effects and interaction models; # = 1 is the least flexible and # = 4 is the most flexible. The default is flex(1). See Remarks for further details.

fp1(fp1_zvarlist) defines a list of continuous variables whose interactions with trt_var are to be investigated by fitting FP functions of degree 1 (i.e., FP1 functions) to each member of fp1_zvarlist in turn, at each level of trt_var. Also see flex().

fp2(fp2_zvarlist) defines a list of continuous variables whose interactions with trt_var are to be investigated by fitting FP functions of degree 2 (i.e., FP2 functions) to each member of fp2_zvarlist in turn, at each level of trt_var. Also see flex().

gendiff(stubname) generates a new variable(s) called stubname#_j that contains fj-f0, the difference between the estimated functions (at level j minus level 0 of trt_var), for the #th member of the list composed of linear_zvarlist, fp1_zvarlist, and fp2_zvarlist, in that order. The difference fj-f0 is an estimate of the covariate-specific effect of level j compared with level 0 (e.g., the covariate-specific treatment effect). gendiff() also creates new variables called stubname#s_j, which contains the standard error of fj-f0, and stubname#lb_j and stubname#ub_j, the lower and upper 95% confidence limits, thus providing all the quantities necessary for a treatment-effect plot.

genf(stubname) generates new variables called stubname#_0, stubname#_1, etc., that contain the fitted functions at levels 0, 1, etc., of trt_var, respectively, for the #th member of the list composed of linear_zvarlist, fp1_zvarlist, and fp2_zvarlist, in that order. For variables in fp1_zvarlist and fp2_zvarlist, the same FP transformation is used at each level of trt_var. The estimated function at level 0 of trt_var is adjusted to have mean 0.

linear(linear_zvarlist) defines a list of variables whose linear interactions with trt_var are to be investigated. If a categorical variable in linear_zvarlist has more than 2 levels, the necessary dummy variables must be created and placed between parentheses to indicate that they should be tested together. More properly, linear_zvarlist is a list of variables of which some may be dummies for categorical variables.

For example, linear((who2 who3)) would bind who2 and who3 together to create a predictor with 2 df for its main effect.

For further information, see the remarks in Working with composite variables.

mfpopts(mfp_options) supplies mfp options to mfpi for the creation of the adjustment model from xvarlist.

noscaling suppresses scaling of all the continuous variables that are subject to FP transformation. The default is automatic application of scale factors; the behavior can be turned off for all such variables by using noscaling. For further details, see the description of the same option in fracpoly and fracgen.

select(select_list) sets the nominal P-values (significance levels) for variable selection among xvarlist by backward elimination. See the select() option of mfp for further details. A typical usage is select(0.05), which selects all variables in xvarlist that are significant at the 5% level according to a backward stepwise algorithm.

Example: select(0.05) All variables have nominal p-value 5 percent.

Example: select(0.05, weight:1) All variables except weight have nominal p-value 5 percent; weight is forced into the model.

Example: select(a (b c):0.05) All variables except a, b, and c are forced into the model. b and c are tested jointly with 2 df at the 5 percent level, and a is tested singly at the 5 percent level.

showmodel shows the variables selected from xvarlist by mfp, together with their FP powers, where relevant. It also shows the interaction model fitted last. showmodel is a concise alternative to detail.

treatment(trt_var) is required and defines the factor or categorical variable whose interactions with variables specified in linear(), fp1(), or fp2() are of interest. trt_var must have at least two distinct, nonmissing values, but the codes are arbitrary because they are mapped to 0, 1, 2, etc., internally. The category corresponding to the lowest value is taken as the reference category (level 0).

regression_cmd_options are any options for regression_cmd.

Options for mfpi_plot

stubname(stubname) is required if the gendiff() option of mfpi was not specified. If the gendiff() option was specified, then stubname is identified automatically. The stubname() option allows you to produce a treatment-effect plot for a variable whose stubname was used in an earlier run of mfpi.

vn(#) identifies the variable in linear(), fp1(), and fp2(), in that order, whose associated treatment effect is to be plotted. Only one variable can be plotted in a given call to mfpi_plot. For example, if linear(x1 x2) fp1(x1 x3) fp2(x3) was specified, the list of variables is x1 x2 x1 x3 x3, and the total number of variables is 5 (not 3). Thus # would be an integer between 1 and 5. The default is vn(1).

level(#) defines the level of trt_var you wish to compare with the base level. Levels of trt_var are coded 0, 1, 2, etc., with the base (reference) level being 0. These coded levels may or may not coincide with the actual values of trt_var in the data. For example, level(1) denotes the second lowest value of trt_var, which could be any positive integer (not necessarily 1). The default # is 1, meaning compare the second level of trt_var with the first level. Specifying a non-existent level raises a "variable ... not found" type of error message.

plot(plot) provides a way to add other plots to the generated graph; see [G] addplot option.

graph_options are any options of graph twoway, such as xtitle() and ytitle().



The algorithm used in mfpi can be summarized as follows. To fix ideas, it is assumed that trt_var is a binary treatment indicator variable in a randomized controlled clinical trial with two parallel groups. trt_var therefore has two levels, arbitrarily labeled 0 and 1.

We first describe the mfpi algorithm for the default flexibility setting, flex(1), i.e., the least flexible case.

Let Z be the continuous or binary covariate of immediate interest, and let T be the binary treatment variable with values of 0 and 1. Let X be a vector of potentially prognostic factors. In mfpi, Z is a member of linear_zvarlist, fp1_zvarlist, or fp2_zvarlist; T is trt_var; and X is xvarlist. The basic idea is to model the relationship between the outcome and Z at each level of T, adjusting for selected members of X, the latter variables transformed by FP if appropriate. A test for interaction is performed on regression coefficients at the final step. The procedure for a Z in fp2_zvarlist (see the fp2() option) is as follows:

1. Apply the MFP algorithm to X with a p-value threshold of a* for selecting variables and FP transformations. Let X* be the resulting covariate vector, which we will call the adjustment model. X* can include (transformed) variables in X selected by the MFP algorithm. If all variables in X are uninfluential, X* may even be null. No covariate adjustment is then performed. In some cases, X* can be formulated from subject-matter knowledge, avoiding data-driven searching.

2. Find by maximum likelihood the best-fitting FP2 powers (p1,p2) for Z, adjusting for X*.

3. For groups j = 0, 1 and powers p_i (i = 1, 2), define new predictors Z_ji = Z^p_i if T = j, Z_ji = 0 otherwise.

4. The test of T x Z interaction is a likelihood-ratio test between the nested models T, Z_01, Z_02, Z_11, Z_12, X* and T, Z^p_1, Z^p_2, X*. The difference in deviance is compared with chi-squared on 2 df.

5. If an interaction is not found, Z is regarded as a potential prognostic factor only. To investigate if an FP2 function is still needed for Z, the final model is chosen by repeating step 1 but including Z as a potential prognostic factor.

The procedure for a Z in fp1_zvarlist, i.e., using FP1 rather than FP2 functions of Z, is in principle identical. The interaction is tested on 1 df.

If Z is binary or is modeled as linear (i.e., is a member of linear_zvarlist), the approach reduces to the usual procedure to estimate and to test for an interaction adjusted for covariates X*.

Extension to cases where T has more than 2 levels follows immediately.

For further details of the MFP algorithm used to construct X*, see Royston and Ambler (1999), Sauerbrei and Royston (1999), and mfp. Further information on the method and applications is given by Royston and Sauerbrei (2004).

Other flexibility values

The covariates X are treated in the same way for all values of flex(). Only the main effect of Z and its interaction with T are dealt with differently, as follows.

With flex(2), the best-fitting FP powers for Z are found within the context of interaction models. The powers are constrained to be the same for all levels of T. The FP powers for the main effect of Z are taken to be the same as for the Z x T interaction.

flex(3) is the same as flex(2), except that the FP powers for the main effect of Z are determined independently of the interaction, exactly as for flex(1).

flex(4), the most flexible option, calls for FP powers to be estimated for the main effect of Z and also for all levels of T in the interaction context.

Hypothesis testing

The main-effects and interaction models for flex(1) are nested, so standard hypothesis testing may be used. Simulation studies (as yet unpublished) have confirmed that the significance level of flex(1) is close to its nominal value.

flex(2) uses the same FP function for the interaction as for the main effect. At first glance, a standard hypothesis test appears appropriate. However, no allowance has been made for degrees of freedom used to estimate FP powers across the levels of the categorical variable in the interaction model, so the p-values from interaction tests should be regarded as indicative, not definitive.

Similar comments apply to flex(3), except that here the main-effects and interaction models are non-nested (because different FP models may be used for the main effects and interaction). P-values should again be taken as indicative rather than definitive.

The extra flexibility provided by flex(4) is reflected in its additional degrees of freedom. Its significance level is currently under investigation in simulation studies.

Working with composite variables

A useful feature of mfpi is the ability to model, and if appropriate plot, functions of predictors represented by one or more individual but linked variables. For example, a restricted cubic regression spline with 4 knots, 2 interior and 2 boundary, is represented by 3 basis functions. Regression is performed with these basis functions as predictors. Such a set of predictors 'belongs' to just one original continuous predictor and must be handled appropriately. This is done by surrounding in parentheses (()) the linked variables in the linear() option.

For example, suppose that age_1 age_2 age_3 were restricted cubic spline basis variables derived from a predictor called age, e.g. using the Stata command mkspline. Analysis of interaction between treatment and the spline function for age is specified by the option linear((age_ age_2 age_3)):

. mkspline age_ = age, cubic nknots(4)

. mfpi, select(0.05) treatment(treat) linear((age_1 age_2 age_3)): stcox z1 z2 z3

When treat is binary, the interaction in the above example has 3 d.f.

A nice feature of this approach is that the treatment-effect plot created by mfpi_plot works happily with the original variable. Continuing the age example above, we specify the gendiff() option of mfpi to create the variables necessary for the treatment-effect plot and then tell mfi_plot to plot the age-related treatment difference and its 95% pointwise confidence interval against age as follows:

. mfpi, select(0.05) treatment(treat) linear((age_1 age_2 age_3)) gendiff(d): stcox z1 z2 z3

. mfpi_plot age

The same approach works with categorical covariates with several levels, represented by dummy variables. Alternatively, such covariates can be defined as factor variables, for example, linear( This avoids the need to create dummy variables. Factor variables are not allowed to be combined into groups using parentheses.

Note on the base level

mfpi assumes that the lowest value of trt_var represents the base level (reference value). If you want a different value to be the base level, you have to recode trt_var before running mfpi. For example

. recode trt (1=3) (3=1)

would reverse the ordering of the levels 1, 2, 3 of the variable trt. See recode for details.


Excitement engendered by the discovery of interactions should be tempered by proper caution. There is a significant problem of multiplicity and of potential instability of the FP functions found. See Royston and Sauerbrei (2004) for a more detailed discussion of this issue. It is prudent to check whether the functions in each treatment group suggest a cutpoint(s) that can reproduce a difference in response; whether the interaction is present in univariate as well as multivariable models; and whether it is robust to the choice of adjustment model.


. mfpi, select(0.05) showmodel treatment(i.treat) linear(c1 (c2 c3)) fp2(x1 x2): stcox x1 x2 c1 c2 c3

. mfpi, select(1) degree(1) dfdefault(2) treatment(i.treat) linear(c1 (c2 c3)) fp1(x1 x2) fp2(x1 x2) genf(f) gendiff(d): logit y x1 x2 c1 c2 c3

. mfpi, treatment(i.treat) linear( x): regress y

. mkspline age_ = age, cubic nknots(4)

. mfpi, select(0.05) treatment(i.treat) linear((age_1 age_2 age_3)) gendiff(d): stcox z1 z2 z3

. mfpi_plot age

Saved results

In addition to what regression_cmd saves in e(), mfpi saves the some or all of the following in r():

Scalars r(nxvar) number of predictors in xvarlist r(nxf) number of predictors in xvarlist selected by mfp r(totdflin) total degrees of freedom for linear interaction model r(totdffp1) total degrees of freedom for FP1 interaction model r(totdffp2) total degrees of freedom for FP2 interaction model r(devlin) deviance for linear interaction model r(devfp1) deviance for FP1 interaction model r(devfp2) deviance for FP2 interaction model r(aiclin) AIC for linear interaction model r(aicfp1) AIC for FP1 interaction model r(aicfp2) AIC for FP2 interaction model r(chi2lin) chi-squared value for testing linear interaction model r(chi2fp1) chi-squared value for testing FP1 interaction model r(chi2fp2) chi-squared value for testing FP2 interaction model r(Plin) p-value for testing linear interaction model r(Pfp1) p-value for testing FP1 interaction model r(Pfp2) p-value for testing FP2 interaction model

Macros r(varlist) xvarlist r(adjust) xvarlist2 r(treatment) trt_var r(Linear) list of variables in linear() r(Fp1) list of variables in fp1() r(Fp2) list of variables in fp2() r(z) z, the final continuous variable processed r(powmain) FP powers for z, main effect r(powint0) FP powers for z at level 0 of trt_var r(powint`j') FP powers for z at level `j' of trt_var (`j' > 0)


Patrick Royston MRC Clinical Trials Unit London, UK


Royston, P., and G. Ambler. 1999. sg112.1: Nonlinear regression models involving power or exponential functions of covariates: Update. Stata Technical Bulletin 50: 26. Reprinted in Stata Technical Bulletin Reprints, vol. 9, p. 180. College Station, TX: Stata Press.

Royston, P., and W. Sauerbrei. 2004. A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials. Statistics in Medicine 23: 2509-2525.

Royston, P., and W. Sauerbrei. 2009. Two techniques for investigating interactions between treatment and continuous covariates in clinical trials. Stata Journal 9 (2): 230-251.

Sauerbrei, W., and P. Royston. 1999. Building multivariable prognostic and diagnostic models: Transformation of the predictors by using fractional polynomials. Journal of the Royal Statistical Society, Series A 162: 71-94.

Also see

Article: Stata Journal, volume 9, number 2: st0164

Manual: [R] fracpoly, [R] mfp

Online: mfp