Interaction

advertisement
Patrick Royston
MRC Clinical Trials Unit,
London, UK
Willi Sauerbrei
Institut of Medical Biometry and Informatics
University Medical Center Freiburg, Germany
Interactions With Continuous Variables –
Extensions of the Multivariable Fractional
Polynomial Approach
Overview
• Issues in regression models
• (Multivariable) fractional polynomials
(MFP)
• Interactions of continuous variable with
– Binary variable
– Continuous variable
– Time
• Summary
2
Observational Studies
Several variables, mix of continuous and (ordered) categorical
variables
Different situations:
– prediction
– explanation
Explanation is the main interest here:
• Identify variables with (strong) influence on the outcome
• Determine functional form (roughly) for continuous variables
The issues are very similar in different types of regression
models (linear regression model, GLM, survival models ...)
Use subject-matter knowledge for modelling ...
... but for some variables, data-driven choice inevitable
3
Regression models
X=(X1, ...,Xp) covariate, prognostic factors
g(x) = ß1 X1 + ß2 X2 +...+ ßp Xp (assuming effects are linear)
normal errors (linear) regression model
Y normally distributed
E (Y|X) = ß0 + g(X)
Var (Y|X) = σ2I
logistic regression model
Y binary
Logit P (Y|X) = ln
P(Y  1 X )
P(Y  0 X )
 β 0  g(X)
survival times
T survival time (partly censored)
Incorporation of covariates
λ(t X )  λ 0 (t)exp
(g(X))
4
Central issue
To select or not to select (full model)?
Which variables to include?
5
Continuous variables – The problem
“Quantifying epidemiologic risk factors using nonparametric regression: model selection remains
the greatest challenge”
Rosenberg PS et al, Statistics in Medicine 2003; 22:3369-3381
Discussion of issues in (univariate) modelling with
splines
Trivial nowadays to fit almost any model
To choose a good model is much harder
6
Alcohol consumption as risk factor for oral cancer
Rosenberg et al, StatMed 2003
7
Building multivariable regression models
Before dealing with the functional form, the
‚easier‘ problem of model selection:
variable selection assuming that the effect
of each continuous variable is linear
8
Multivariable models - methods for variable
selection
Full model
– variance inflation in the case of multicollinearity
Stepwise procedures  prespecified (in, out) and
actual significance level?
• forward selection (FS)
• stepwise selection (StS)
• backward elimination (BE)
All subset selection  which criteria?
• Cp
• AIC
• BIC
Mallows
Akaike Information Criterion
Bayes Information Criterion
=
(SSE / σˆ)2 - n+ p 2
= n ln (SSE / n)
+p2
= n ln (SSE / n)
+ p ln(n)
fit
penalty
Combining selection with Shrinkage
Bayes variable selection
Recommendations???
Central issue: MORE OR LESS COMPLEX MODELS?
9
Backward elimination
is a sensible approach
- Significance level can be chosen
- Reduces overfitting
Of course required
• Checks
• Sensitivity analysis
• Stability analysis
10
Continuous variables – what functional
form?
Traditional approaches
a) Linear function
- may be inadequate functional form
- misspecification of functional form may lead to
wrong conclusions
b) ‘best‘ ‘standard‘ transformation
c) Step function (categorial data)
- Loss of information
- How many cutpoints?
- Which cutpoints?
- Bias introduced by outcome-dependent choice
11
StatMed 2006, 25:127-141
12
Continuous variables – newer approaches
• ‘Non-parametric’ (local-influence) models
– Locally weighted (kernel) fits (e.g. lowess)
– Regression splines
– Smoothing splines
• Parametric (non-local influence) models
– Polynomials
– Non-linear curves
– Fractional polynomials
• Intermediate between polynomials and non-linear curves
13
Fractional polynomial models
• Describe for one covariate, X
• Fractional polynomial of degree m for X with powers p1,
… , pm is given by
FPm(X) = 1 X p1 + … + m X pm
• Powers p1,…, pm are taken from a special set
{2,  1,  0.5, 0, 0.5, 1, 2, 3}
• Usually m = 1 or m = 2 is sufficient for a good fit
• Repeated powers (p1=p2)
1 X p1 + 2 X p1log X
• 8 FP1, 36 FP2 models
14
Examples of FP2 curves
- varying powers
(-2, 1)
(-2, 2)
(-2, -2)
(-2, -1)
15
Examples of FP2 curves
- single power, different coefficients
(-2, 2)
4
Y
2
0
-2
-4
10
20
30
x
40
50
16
Our philosophy of function selection
• Prefer simple (linear) model
• Use more complex (non-linear) FP1 or FP2
model if indicated by the data
• Contrasts to more local regression modelling
– Already starts with a complex model
17
Example: Prognostic factors
GBSG-study in node-positive breast cancer
299 events for recurrence-free survival time (RFS) in
686 patients with complete data
7 prognostic factors, of which 5 are continuous
18
FP analysis for the effect of age
Degree 1
Power Model
chisquare
-2
6.41
-1
3.39
-0.5
2.32
0
1.53
0.5
0.97
1
0.58
2
0.17
3
0.03
Powers
-2
-2
-2
-2
-2
-2
-2
-2
-1
-1
-1
-1
-2
-1
-0.5
0
0.5
1
2
3
-1
-0.5
0
0.5
Model
chisquare
17.09
17.57
17.61
17.52
17.30
16.97
16.04
14.91
17.58
17.30
16.85
16.25
Degree 2
Powers
Model Powers
chisquare
-1
1
15.56
0
2
-1
2
13.99
0
3
-1
3
12.37 0.5 0.5
-0.5 -0.5
16.82 0.5 1
-0.5
0
16.18 0.5 2
-0.5
0.5
15.41 0.5 3
-0.5
1
14.55
1
1
-0.5
2
12.74
1
2
-0.5
3
10.98
1
3
0
0
15.36
2
2
0
0.5
14.43
2
3
0
1
13.44
3
3
Model
chisquare
11.45
9.61
13.37
12.29
10.19
8.32
11.14
8.99
7.15
6.87
5.17
3.67
7
19
Function selection procedure (FSP)
Effect of age at 5% level?
χ2
df
p-value
Any effect?
Best FP2 versus null
17.61
4
0.0015
Linear function suitable?
Best FP2 versus linear
17.03
3
0.0007
FP1 sufficient?
Best FP2 vs. best FP1
11.20
2
0.0037
20
Many predictors – MFP
With many continuous predictors selection of best
FP for each becomes more difficult  MFP
algorithm as a standardized way to variable and
function selection
(usually binary and categorical variables are also
available)
MFP algorithm combines
backward elimination with
FP function selection procedures
21
Continuous factors
Different results with different analyses
Age as prognostic factor in breast cancer (adjusted)
P-value
0.9
0.2
0.001
22
Results similar?
Nodes as prognostic factor in breast cancer (adjusted)
P-value
0.001
0.001
0.001
23
Multivariable FP
Final Model in breast cancer example


2
 0 .5
0 .5
f  X 1 , X 1 ; X 4 a ; exp   0.12 X 5 ; X 6 


age
grade
nodes
progesterone
Model choosen out of
Model
more than a million possible models,
one model selected
- Sensible?
- Interpretable?
- Stable?
Bootstrap stability analysis (see R & S 2003)
24
Example: Risk factors
• Whitehall 1
– 17,370 male Civil Servants aged 40-64 years,
1670 (9.7%) died
– Measurements include: age, cigarette smoking, BP,
cholesterol, height, weight, job grade
– Outcomes of interest: all-cause mortality at 10 years 
logistic regression
25
Whitehall 1
Systolic blood pressure
Deviance difference in comparison to a straight line for FP(1)
and FP(2) models
First degree
Power Deviance
Difference
p
-74.19
-2
-43.15
-1
-29.40
-0.5
-17.37
0
-7.45
0.5
0.00
1
6.43*
2
0.98
3
Powers
q
p
-2
-2
-1
-2
-2 -0.5
0
-2
0.5
-2
1
-2
2
-2
3
-2
-1
-1
-1 -0.5
0
-1
-1 0.5
Fractional polynomials
Second degree
Deviance Powers Deviance
q Difference
Difference p
1 12.97
-1
26.22*
2 7.80
-1
24.43
3 2.53
-1
22.80
-0.5 -0.5 17.97
20.72
-0.5 0 16.00
18.23
-0.5 0.5 13.93
15.38
-0.5 1 11.77
8.85
-0.5 2 7.39
1.63
-0.5 3 3.10
21.62
0 14.24
0
19.78
0.5 12.43
0
17.69
1 10.61
0
15.41
Powers
q
p
2
0
3
0
0.5 0.5
1
0.5
2
0.5
3
0.5
1
1
2
1
3
1
2
2
3
2
3
3
Deviance
difference
7.05
3.74
10.94
9.51
6.80
4.41
8.46
6.61
5.11
6.44
6.45
7.59
26
Similar fit of several functions
27
Presentation of models for continuous
covariates
• The function + 95% CI gives the whole story
• Functions for important covariates should always be
plotted
• In epidemiology, sometimes useful to give a more
conventional table of results in categories
• This can be done from the fitted function
28
Whitehall 1
Systolic blood pressure
Odds ratio from final FP(2) model
LogOR= 2.92 – 5.43X-2 –14.30* X –2 log X
Presented in categories
Systolic blood pressure (mm
Hg)
Range
ref. point
 90
88
91-100
95
101-110
105
111-120
115
121-130
125
131-140
135
141-160
150
161-180
170
181-200
190
201-240
220
241-280
250
Number of men
at risk
dying
OR (model-based)
Estimate
95%CI
27
283
1079
2668
3456
4197
2775
1437
438
154
5
2.47
1.42
1.00
0.94
1.04
1.25
1.77
2.87
4.54
8.24
15.42
3
22
84
164
289
470
344
252
108
41
4
1.75, 3.49
1.21, 1.67
0.86, 1.03
0.91, 1.19
1.07, 1.46
1.50, 2.08
2.42, 3.41
3.78, 5.46
6.60, 10.28
11.64, 20.43
29
Whitehall 1
MFP analysis
Covariate
Age
Cigarettes
Systolic BP
Total cholesterol
Height
Weight
Job grade
FP etc.
Linear
0.5
-1, -0.5
Linear
Linear
-2, 3
In
No variables were eliminated by the MFP algorithm
Assuming a linear function weight is eliminated by
backward elimination
30
Interactions
Motivation – I
Detecting predictive factors (interaction with treatment)
• Don’t investigate effects in separate subgroups!
• Investigation of treatment/covariate interaction requires
statistical tests
• Care is needed to avoid over-interpretation
• Distinguish two cases:
- Hypothesis generation: searching several interactions
- Specific predefined hypothesis
• For current bad practise - see Assmann et al (Lancet
2000)
31
Motivation - II
Continuous by continuous interactions
• usually linear by linear product term
• not sensible if main effect (prognostic effect) is nonlinear
• mismodelling the main effect may introduce spurious
interactions
32
Detecting predictive factors
(treatment – covariate interaction)
• Most popular approach
- Treatment effect in separate subgroups
- Has several problems (Assman et al 2000)
• Test of treatment/covariate interaction required
- For `binary`covariate standard test for interaction
available
• Continuous covariate
- Often categorized into two groups
33
Categorizing a continuous covariate
• How many cutpoints?
• Position of the cutpoint(s)
• Loss of information  loss of power
34
Standard approach
• Based on binary predictor
• Need cut-point for continuous predictor
• Illustration - problem with cut-point approach
TAM*ER interaction in breast cancer (GBSG-study)
P -value (log scale)
.5
.2
.1
.05
.02
.01
0
5
10
ER cutpoint, fmol/l
15
20
35
Treatment effect by subgroup
Lo g h azard ra tio for tre atm e nt
Low ER
High ER
1
.5
0
-.5
0
5
10
ER cutpoint, fmol/l
15
20
36
New approaches for continuous covariates
• STEPP
Subpopulation treatment effect pattern plots
Bonetti & Gelber 2000
• MFPI
Multivariable fractional polynomial
interaction approach
Royston & Sauerbrei 2004
37
STEPP
Contin. covariate
Sequences of overlapping subpopulations
Sliding window
Tail oriented
2g-1 subpopulations
(here g=8)
38
STEPP
Estimates in subpopulations
• No interaction 
treatment effects ‚similar‘ in all subpopulations
• Plot effects in subpopulations
39
STEPP
Overlapping populations, therefore correlation
between treatment effects in subpopulations
Simultaneous confidence band and tests
proposed
40
MFPI
• Have one continuous factor X of interest
• Use other prognostic factors to build an adjustment
model, e.g. by MFP
Interaction part – with or without adjustment
• Find best FP2 transformation of X with same powers in
each treatment group
• LRT of equality of reg coefficients
• Test against main effects model(no interaction) based
on 2 with 2df
• Distinguish
predefined hypothesis - hypothesis searching
41
RCT: Metastatic renal carcinoma
Comparison of MPA with interferon
1 .00
N = 347, 322 Death
(1) MPA
0 .00
0 .25
0 .50
0 .75
(2) Interferon
At risk 1:
175
55
22
11
3
2
1
At risk 2:
172
73
36
20
8
5
1
0
12
24
36
48
60
72
Follow-up (months)
42
Overall: Interferon is better (p<0.01)
• Is the treatment effect similar in all patients?
Sensible questions?
- Yes, from our point of view
• Ten factors available for the investigation of
treatment – covariate interactions
43
MFPI
Treatment effect function for WCC
-4
-2
0
2
Original data
5
10
White cell count
15
20
Only a result of complex (mis-)modelling?
44
Does the MFPI model agree with the data?
Check proposed trend
Treatment effect in subgroups defined by WCC
0
12
24
36
48
60
72
12
24
36
0
12
24
48
Follow-up (months)
60
72
36
48
60
72
Group IV
0.00 0.25 0.50 0.75 1.00
0.00 0.25 0.50 0.75 1.00
Group III
0
Group II
0.00 0.25 0.50 0.75 1.00
0.00 0.25 0.50 0.75 1.00
Group I
0
12
24
36
48
60
72
Follow-up (months)
HR (Interferon to MPA; adjusted values similar) overall: 0.75 (0.60 – 0.93)
I : 0.53 (0.34 – 0.83)
II : 0.69 (0.44 – 1.07)
III : 0.89 (0.57 – 1.37)
IV : 1.32 (0.85 –2.05)
45
STEPP – Interaction with WCC
SLIDING WINDOW (n1 = 25, n2 = 40)
TAIL ORIENTED (g = 8)
46
1
.5
.2
.1
Relative hazard
2
STEPP as check of MFPI
4
8
12
16
20
White cell count
STEPP – tail-oriented, g = 6
47
MFPI – Type I error
0
.5
D en sity
1
1 .5
Random permutation of a continuous covariate
(haemoglobin)
 no interaction
Distribution of P-value from test of interaction
1000 runs, Type I error: 0.054
0
.2
.4
.6
.8
1
p
48
Continuous by continuous interactions
MFPIgen
• Have Z1 , Z2 continuous and X confounders
• Apply MFP to X, Z1 and Z2, forcing Z1 and Z2 into the model. FP
functions f1(Z1) and f2(Z2) will be selected for Z1 and Z2
• Add term f1(Z1)* f2(Z2) to the model chosen and use LRT for test
of interaction
• Often f1(Z1) and/or f2(Z2) are linear
• Check all pairs of continuous variables for an interaction
• Check (graphically) interactions for artefacts
• Use forward stepwise if more than one interaction remains
• Low significance level for interactions
49
Interactions
Whitehall 1
Consider only age and weight
Main effects:
age – linear
weight – FP2 (-1,3)
Interaction?
Include age*weight-1 + age*weight3
into the model
LRT: χ2 = 5.27 (2df, p = 0.07)
 no (strong) interaction
50
Erroneously assume that the effect of weight is linear
Interaction?
Include age*weight into the model
LRT: χ2 = 8.74 (1df, p = 0.003)
 hightly significant interaction
51
Model check:
categorize age in 4 equal sized groups
Compute running line smooth of the binary outcome
on weight in each group
52
0
Whitehall 1: check of age x weight interaction
2nd quartile
4th quartile
-4
-3
-2
-1
1st quartile
3rd quartile
40
60
80
100
120
140
weight
53
Running line smooth are about
parallel across age groups 
no (strong) interactions
smoothed probabilities are about equally spaced 
effect of age is linear
54
Erroneously assume that the effect of
weight is linear
Estimated slopes of weight in age-groups indicates
strong qualitative interaction between age und weight
55
Whitehall 1:
P-values for two-way interactions from MFPIgen
*FP transformations
 chol*age highly significant
56
Presentation of interactions
Whitehall 1: age*chol interaction
Effect (adjusted) for 10th, 35th, 65th and 90th centile
57
Age*Chol interaction
Chol ‚low‘ : age has an effect
Chol ‚high‘ : age has no effect
Age ‚low‘ : chol has an effect
Age ‚high‘ : chol has no effect
58
Age*Chol interaction
Does the model fit? Check in 4 subgroups
Linearity of chol: ok
But: Slopes are not monotonically ordered
Lack of fit of linear*linear interaction
59
More complicated model?
Interaction ‚real‘?
Validation in new data!
60
Survival data
effect of a covariate may vary in time 
time by covariate interaction
61
Extending the Cox model
• Cox model
(t | X) = 0(t) exp (X)
• Relax PH-assumption
dynamic Cox model
(t | X) = 0(t) exp ((t) X)
HR(x,t) – function of X and time t
•
Relax linearity assumption
(t | X) = 0(t) exp ( f (X))
62
Causes of non-proportionality
• Effect gets weaker with time
• Incorrect modelling
–
–
–
omission of an important covariate
incorrect functional form of a covariate
different survival model is appropriate
63
Non-PH – What can be done?
Non-PH
- Does it matter ?
- Is it real ?
Non-PH is large and real
–
–
–
stratify by the factor
(t|X, V=j) = j (t) exp (X )
• effect of V not estimated, not tested
• for continuous variables grouping necessary
Partition time axis
Model non-proportionality by time-dependent
covariate
64
Example: Time-varying effects
Rotterdam breast cancer data
2982 patients
1 to 231 months follow-up time
1518 events for RFI (recurrence free interval)
Adjuvant treatment with chemo- or hormonal therapy according to clinic
guidelines
70% without adjuvant treatment
Covariates
continuous
age, number of positive nodes, estrogen, progesterone
categorical
menopausal status, tumor size, grade
65
• Treatment variables ( chemo , hormon) will be
analysed as usual covariates
• 9 covariates , partly strong correlation
(age-meno; estrogen-progesterone; chemo,
hormon – nodes )
 variable selection
• Use multivariable fractional polynomial approach
for model selection in the Cox proportional
hazards model
66
Assessing PH-assumption
• Plots
– Plots of log(-log(S(t))) vs log t should be parallel for groups
– Plotting Schoenfeld residuals against time to identify patterns in
regression coefficients
– Many other plots proposed
• Tests
many proposed, often based on Schoenfeld residuals,
most differ only in choice of time transformation
• Partition the time axis and fit models seperatly to each
time interval
• Including time-by-covariate interaction terms in the
model and estimate the log hazard ratio function
67
Smoothed Schoenfeld residuals –
multivariable MFP model assuming PH
68
Selected model with MFP
estimates
Factor
test of time-varying effect for
different time transformations
t
p-value
rank(t)
Log(t)
Sqrt(t)

SE
X1 – age
-0.01
0.002
0.082
0.243
0.329
0.149
X3a – size
0.29
0.057
0.000
0.000
0.001
0.000
X4b – grade
0.39
0.064
0.189
0.198
0.129
0.164
X5e – nodes
-1.71
0.081
0.002
0.000
0.000
0.000
X8 - chemo-T
-0.39
0.085
0.091
0.008
0.023
0.034
X9 – horm-T
-0.45
0.073
0.014
0.001
0.000
0.002
Index
1.00
0.039
0.000
0.000
0.000
0.000
69
Including time – by covariate interaction
(Semi-) parametric models for β(t)
• model (t) x =  x +  x g(t)
calculate time-varying covariate x g(t)
fit time-varying Cox model and test for 
plot (t) against t
•

0
g(t) – which form?
–
–
–
–
‘usual‘ function, eg t, log(t)
piecewise
splines
fractional polynomials
70
MFPTime algorithm
Motivation
• Multivariable strategy required to select
– Variables which have influence on outcome
– For continuous variables determine functional form of
the influence (‚usual‘ linearity assumption sensible?)
– Proportional hazards assumption sensible or does a
time-varying function fit the data better?
71
MFPTime algorithm (1)
• Determine (time-fixed) MFP model M0
possible problems
variable included, but effect is not
constant in time
variable not included because of
short term effect only
• Consider short term period only
Additional to M0 significant variables?
This gives M1
72
MFPTime algorithm (2)
For all variables (with transformations) selected from full time-period
and short time-period
• Investigate time function for each covariate in forward stepwise
fashion - may use small P value
• Adjust for covariates from selected model
• To determine time function for a variable
compare deviance of models ( 2) from
FPT2 to null (time fixed effect)
4 DF
FPT2 to log
3 DF
FPT2 to FPT1
2 DF
• Use strategy analogous to stepwise to add
time-varying functions to MFP model M1
73
Development of the model
Variable
X1
X3b
Model M0
Model M1
Model M2
β
SE
β
SE
β
SE
-0.013
0.002
-0.013
0.002
-0.013
0.002
0.171
0.080
0.150
0.081
-
-
X4
0.39
0.064
0.354
0.065
0.375
0.065
X5e(2)
-1.71
0.081
-1.681
0.083
-1.696
0.084
X8
-0.39
0.085
-0.389
0.085
-0.411
0.085
X9
-0.45
0.073
-0.443
0.073
-0.446
0.073
X3a
0.29
0.057
0.249
0.059
- 0.112
0.107
-0.032
0.012
- 0.137
0.024
logX6
-
-
X3a(log(t))
-
-
-
-
- 0.298
0.073
logX6(log(t))
-
-
-
-
0.128
0.016
0.504
0.082
-0.361
0.052
Index
Index(log(t))
1.000
-
0.039
-
1.000
-
0.038
-
74
Time-varying effects in final model
75
Final model includes time-varying
functions for
progesterone ( log(t) )
and
tumor size ( log(t) )
Prognostic ability of the Index vanishes in
time
76
Software sources MFP
• Most comprehensive implementation is in Stata
– Command mfp is part since Stata 8 (now Stata 10)
• Versions for SAS and R are available
– SAS
www.imbi.uni-freiburg.de/biom/mfp
– R version available on CRAN archive
• mfp package
• Extensions to investigate interactions
• So far only in Stata
77
Concluding comments – MFP
• FPs use full information - in contrast to a priori
categorisation
• FPs search within flexible class of functions (FP1 and
FP(2)-44 models)
• MFP is a well-defined multivariate model-building
strategy – combines search for transformations with
BE
• Important that model reflects medical knowledge,
e.g. monotonic / asymptotic functional forms
78
Towards recommendations for model-building by selection of
variables and functional forms for continuous predictors under
several assumptions
Issue
Recommendation
Variable selection procedure
Backward elimination; significance level as key
tuning parameter, choice depends on the aim of the
study
Functional form for continuous
covariates
Linear function as the 'default', check improvement
in model fit by fractional polynomials. Check
derived function for undetected local features
Extreme values or influential points
Check at least univariately for outliers and
influential points in continuous variables. A
preliminary transformation may improve the model
selected. For a proposal see R & S 2007
Sensitivity analysis
Important assumptions should be checked by a
sensitivity analysis. Highly context dependent
Check of model stability
The bootstrap is a suitable approach to check for
model stability
Complexity of a predictor
A predictor should be 'as parsimonious as possible'
Sauerbrei et al. SiM 2007
79
Interactions
• Interactions are often ignored by analysts
• Continuous  categorical has been studied in FP
context because clinically very important
• Continuous  continuous is more complex
• Interaction with time important for long-term FU
survival data
80
MFP extensions
• MFPI – treatment/covariate interactions
In contrast to STEPP it avoids categorisation
• MFPIgen – interaction between two continuous
variables
• MFPT – time-varying effects in survival data
81
Summary
Getting the big picture right is more important
than optimising aspects and ignoring others
• strong predictors
• strong non-linearity
• strong interactions
• strong non-PH in survival model
82
References
Harrell FE jr. (2001): Regression Modeling Strategies. Springer.
Royston P, Altman DG. (1994): Regression using fractional polynomials of continuous covariates: parsimonious parametric
modelling (with discussion). Applied Statistics, 43, 429-467.
Royston P, Altman DG, Sauerbrei W (2006): Dichotomizing continuous predictors in multiple regression: a bad idea.
Statistics in Medicine, 25, 127-141.
Royston P, Sauerbrei W. (2004): A new approach to modelling interactions between treatment and continuous covariates in
clinical trials by using fractional polynomials. Statistics in Medicine, 23, 2509-2525.
Royston P, Sauerbrei W. (2005): Building multivariable regression models with continuous covariates, with a practical
emphasis on fractional polynomials and applications in clinical epidemiology. Methods of Information in Medicine, 44, 561571.
Royston P, Sauerbrei W. (2007): Improving the robustness of fractional polynomial models by preliminary covariate
transformation: a pragmatic approach. Computational Statistics and Data Analysis, 51: 4240-4253.
Royston P, Sauerbrei W (2008): Multivariable Model-Building - A pragmatic approach to regression analysis based on
fractional polynomials for continuous variables. Wiley.
Sauerbrei W. (1999): The use of resampling methods to simplify regression models in medical statistics. Applied Statistics, 48,
313-329.
Sauerbrei W, Meier-Hirmer C, Benner A, Royston P. (2006): Multivariable regression model building by using fractional
polynomials: Description of SAS, STATA and R programs. Computational Statistics & Data Analysis, 50, 3464-3485.
Sauerbrei W, Royston P. (1999): Building multivariable prognostic and diagnostic models: transformation of the predictors
by using fractional polynomials. Journal of the Royal Statistical Society A, 162, 71-94.
Sauerbrei W, Royston P, Binder H (2007): Selection of important variables and determination of functional form for
continuous predictors in multivariable model building. Statistics in Medicine, 26:5512-28.
Sauerbrei W, Royston P, Look M. (2007): A new proposal for multivariable modelling of time-varying effects in survival data
based on fractional polynomial time-transformation. Biometrical Journal, 49: 453-473.
Sauerbrei W, Royston P, Zapien K. (2007): Detecting an interaction between treatment and a continuous covariate: a
comparison of two approaches. Computational Statistics and Data Analysis, 51: 4054-4063.
Schumacher M, Holländer N, Schwarzer G, Sauerbrei W. (2006): Prognostic Factor Studies. In Crowley J, Ankerst DP (ed.),
Handbook of Statistics in Clinical Oncology, Chapman&Hall/CRC, 289-333.
83
Download