Title: | Companion Package to "Dynamic Prediction in Clinical Survival Analysis" |
---|---|
Description: | The dynpred package contains functions for dynamic prediction in survival analysis. |
Authors: | Hein Putter |
Maintainer: | Hein Putter <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.2 |
Built: | 2024-11-07 03:41:08 UTC |
Source: | https://github.com/cran/dynpred |
The companion package of the book "Dynamic Prediction in Clinical Survival Analysis".
Package: | dynpred |
Type: | Package |
Version: | 0.1.2 |
Date: | 2014-11-10 |
License: | GPL (>= 2) |
An overview of how to use the package, including the most important functions.
Author: Hein Putter. Maintainer: Hein Putter <[email protected]>
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
Calculate model-free curve of Area Under the Curve values over time, based on the dynamic/incident AUC of Heagerty and Zheng.
AUC(formula, data, plot = TRUE)
formula | Formula for prediction model to be used as in coxph |
data | Data set in which to interpret the formula |
plot | Determines whether the AUC function should be plotted (if TRUE, the default) or not (if FALSE) |
A list with elements
AUCt | A data frame with time t in one column and AUC(t) in the other |
AUC | The AUC(t) weighted by Y(t)-1, with Y(t) the number at risk at t; this coincides with Harrell's c-index |
Hein Putter [email protected]
Harrell FE, Lee KL & Mark DB (1996), Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors, Statistics in Medicine 15, 361-387.
Heagerty PJ & Zheng Y (2005), Survival model predictive accuracy and ROC curves, Biometrics 61, 92-105.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
AUC(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova)
Calculate dynamic model-free curve of Area Under the Curve values over time, based on the dynamic/incident AUC of Heagerty and Zheng.
AUCw(formula, data, width)
formula | Formula for prediction model to be used as in coxph |
data | Data set in which to interpret the formula |
width | Width of the window |
A data frame with columns
time | The time points t at which AUCw(t) changes value (either t or t+width is an event time point) |
AUCw | The AUCw(t) function |
and with attribute "width" as given in the input.
Hein Putter [email protected]
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
AUCw(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova, width = 2)
This function calculates Harrell's c-index.
cindex(formula, data)
formula | Formula for prediction model to be used as in coxph |
data | Data set in which to interpret the formula |
A list with elements
concordant | The number of concordant pairs |
total | The total number of pairs that can be evaluated |
cindex | Harrell's c-index |
Hein Putter [email protected]
Harrell FE, Lee KL & Mark DB (1996), Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors, Statistics in Medicine 15, 361-387.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
cindex(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova)
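The three elements returned by cindex (concordant, total, cindex) can be illustrated with a hand computation on a risk score. The following is a minimal sketch in base R, not the package's implementation: the function name harrell_c is hypothetical, a pair is assumed usable when the shorter of the two times is an event, and ties in the risk score are assumed to count as half concordant.

```r
# Hypothetical helper, not part of dynpred.
# time, status: follow-up time and event indicator (1 = event, 0 = censored)
# risk: risk score, higher values meaning earlier expected events
harrell_c <- function(time, status, risk) {
  n <- length(time)
  concordant <- 0
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Pair (i, j) is usable if i has the shorter time and i is an event
      if (status[i] == 1 && time[i] < time[j]) {
        total <- total + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5  # assumption: ties count as 1/2
        }
      }
    }
  }
  list(concordant = concordant, total = total, cindex = concordant / total)
}

# Toy data (made up for illustration)
toy_time   <- c(1, 2, 3, 4)
toy_status <- c(1, 1, 0, 1)
harrell_c(toy_time, toy_status, risk = c(4, 3, 2, 1))
```

In this toy example the censored subject at time 3 can only appear as the longer-lived member of a pair, which is why five pairs are usable rather than six.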
Create landmark data set from original data, which can be either in wide or long format, see details.
cutLM(data, outcome, LM, horizon, covs, format = c("wide", "long"), id, rtime, right = TRUE)
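data | Data frame from which to construct landmark dataset |
outcome | List with items time and status, character strings naming the time and status columns of the survival outcome in data |
LM | Scalar, the value of the landmark time point |
horizon | Scalar, the value of the horizon. Administrative censoring is applied at horizon |
covs | List with items fixed and varying, character vectors naming the time-fixed and time-varying covariates in data |
format | Character string specifying whether the original data are in wide (default) or in long format |
id | Character string specifying the column name in data containing the subject identifier (used with format="long") |
rtime | Character string specifying the column name in data containing the time points at which the time-varying covariate changes value (used with format="long") |
right | Boolean (default=TRUE), determining whether a change of the time-varying covariate at exactly rtime takes effect only after that time point (TRUE) or already at it (FALSE) |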
For a given landmark time point LM, patients who have reached the event of interest (outcome) or are censored before or at LM are removed. Administrative censoring is applied at the time horizon. Time-varying covariates are evaluated at the landmark time point LM.
Time-varying covariates can be specified in the varying item of the covs argument, in two ways. In the first way (data in long format), different values of the time-dependent covariate(s) are stored in different rows of the data, with id identifying which values belong to the same subject; the column specified through rtime then contains the time points at which the covariate changes value. With right=TRUE (default), it is assumed that the covariate changes value at the time point specified in rtime (and hence is not used for prediction of an event at rtime), while with right=FALSE, it is assumed that the covariate changes value just before the time point specified in rtime. The second way (data in wide format) can only be used for a specific type of time-varying covariate, often used to model whether some other event has occurred or not, namely one that changes value from 0 (event not yet occurred) to 1 (event has occurred).
A landmark data set, containing the outcome and the values of time-fixed and time-varying covariates taken at the landmark time point. The value of the landmark time point is stored in column LM.
Hein Putter [email protected]
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
test0 <- data.frame(id=c(1,1,1,2,2,2),survyrs=c(2.3,2.3,2.3,2.7,2.7,2.7),
  survstat=c(1,1,1,0,0,0),age=c(76,76,76,68,68,68),gender=c(1,1,1,2,2,2),
  bp=c(80,84,88,92,90,89),bptime=c(1,2,2.2,0,1,2))
cutLM(test0, outcome=list(time="survyrs", status="survstat"),
  LM=1, horizon=2.5, covs=list(fixed=c("age","gender"),varying="bp"),
  format="long", id="id", rtime="bptime")
# Note how the previous example does not use the value of the time-varying
# covariate AT time=LM, only just before (if available). This is in line
# with the time-varying covariates being predictable.
# If you want the value of the time-varying covariate at time=LM if it
# changes value at LM, then use right=FALSE
cutLM(test0, outcome=list(time="survyrs", status="survstat"),
  LM=1, horizon=2.5, covs=list(fixed=c("age","gender"),varying="bp"),
  format="long", id="id", rtime="bptime", right=FALSE)
# An example of a time-varying covariate in wide format; recyrs and recstat
# are time and status of a (cancer) recurrence. Here it is assumed that the
# value of the time-varying covariate is 0 and changes value to 1 at recyrs.
# The status variable is not used!
test1 <- data.frame(id=1:4,survyrs=c(7.6,8.4,5.3,2.6),survstat=c(0,1,1,0),
  age=c(48,52,76,18),gender=c(1,2,2,1),recyrs=c(7.6,5.2,0.8,2.6),
  recstat=c(0,1,1,0))
cutLM(test1, outcome=list(time="survyrs", status="survstat"),
  LM=3, horizon=8, covs=list(fixed=c("id","age","gender"),varying="recyrs"))
# The same example in long format, similar to (but not the same as) the way
# one would use a time-varying covariate in long format.
test2 <- data.frame(id=c(1,2,2,3,3,4),survyrs=c(7.6,8.4,8.4,5.3,5.3,2.6),
  survstat=c(0,1,1,1,1,0),age=c(48,52,52,76,76,18),gender=c(1,2,2,2,2,1),
  rec=c(0,0,1,0,1,0),rectime=c(0,0,5.2,0,0.8,0))
cutLM(test2, outcome=list(time="survyrs", status="survstat"),
  LM=3, horizon=8, covs=list(fixed=c("age","gender"),varying="rec"),
  format="long", id="id", rtime="rectime")
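The landmarking steps described for wide-format data (drop subjects with an event or censoring at or before LM, censor administratively at the horizon, turn the event-time column into a 0/1 covariate, store LM) can be sketched in base R. This is an illustration under stated assumptions, not the code of cutLM: the function name landmark_wide is hypothetical, and treating an event time equal to LM as "already occurred" is an assumption made here.

```r
# Hypothetical helper, not part of dynpred.
landmark_wide <- function(data, time, status, LM, horizon, varying = NULL) {
  # Keep only subjects still under follow-up after the landmark
  out <- data[data[[time]] > LM, , drop = FALSE]
  # Administrative censoring at the horizon
  late <- out[[time]] > horizon
  out[[status]][late] <- 0
  out[[time]][late] <- horizon
  # 0/1 covariate: has the intermediate event occurred by LM?
  # (assumption: an event exactly at LM counts as occurred)
  if (!is.null(varying)) {
    out[[varying]] <- as.numeric(out[[varying]] <= LM)
  }
  out$LM <- LM
  out
}

# Same toy data as the wide-format example (recyrs = time of recurrence)
test1 <- data.frame(id = 1:4, survyrs = c(7.6, 8.4, 5.3, 2.6),
                    survstat = c(0, 1, 1, 0), age = c(48, 52, 76, 18),
                    gender = c(1, 2, 2, 1), recyrs = c(7.6, 5.2, 0.8, 2.6))
landmark_wide(test1, time = "survyrs", status = "survstat",
              LM = 3, horizon = 8, varying = "recyrs")
```

Subject 4 is dropped because follow-up ends before the landmark, and subject 2's death at 8.4 years becomes a censored observation at the horizon of 8 years.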
This function calculates cross-validated versions of Harrell's c-index.
CVcindex(formula, data, type = "single", matrix = FALSE)
formula | Formula for prediction model to be used as in coxph |
data | Data set in which to interpret the formula |
type | One of "single" (default), "pair" or "fullpairs" |
matrix | If TRUE, the matrix of cross-validated prognostic indices is also returned; default is FALSE |
A list with elements
concordant | The number of concordant pairs |
total | The total number of pairs that can be evaluated |
cindex | The cross-validated c-index |
matrix | Matrix of cross-validated prognostic indices (only if argument matrix is TRUE) |
and with attribute "type" as given in the input.
Hein Putter [email protected]
Harrell FE, Lee KL & Mark DB (1996), Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors, Statistics in Medicine 15, 361-387.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
# Real thing takes a long time, so on a smaller data set
ova2 <- ova[1:100,]
# Actual c-index
cindex(Surv(tyears,d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova2)
# Cross-validated c-indices
CVcindex(Surv(tyears,d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova2)
CVcindex(Surv(tyears,d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova2,
  type="pair")
CVcindex(Surv(tyears,d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova2,
  type="fullpairs")
This function calculates the cross-validated log partial likelihood, with shrinkage if requested.
CVPL(formula, data, progress = TRUE, overall = FALSE, shrink = 1)
formula | Formula for prediction model to be used as in coxph |
data | Data set in which to interpret the formula |
progress | If TRUE (default), progress of the cross-validation is reported |
overall | Boolean; default is FALSE |
shrink | Shrinkage factor; default is 1 (no shrinkage) |
Numeric; the cross-validated log partial likelihood
Hein Putter [email protected]
Verweij PJM & van Houwelingen HC (1994), Penalized likelihood in Cox regression, Statistics in Medicine 13, 2427-2436.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
CVPL(Surv(tyears, d) ~ 1, data = ova)
CVPL(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova)
CVPL(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova,
  overall=TRUE)
A simple but useful debugging function. It first announces the object to be printed and then prints it.
deb(x, method = c("print", "cat"))
x | The object to be printed |
method | The method for printing, either "print" (default) or "cat" |
Hein Putter [email protected]
tm <- c(0.2,0.5,1,1.2,1.8,4)
ta <- 2*tm
dfr <- data.frame(time=tm, stepf=ta)
deb(dfr, method="print")
deb(tm, method="cat")
Data from the European Society for Blood and Marrow Transplantation (EBMT)
A data frame of 2279 patients transplanted at the EBMT between 1985 and 1998. These data were used in Fiocco, Putter & van Houwelingen (2008) and van Houwelingen & Putter (2008). The included variables are
Patient identification number
Time in days from transplantation to recovery or last follow-up
Recovery status; 1 = recovery, 0 = censored
Time in days from transplantation to adverse event (AE) or last follow-up
Adverse event status; 1 = adverse event, 0 = censored
Time in days from transplantation to both recovery and AE or last follow-up
Recovery and AE status; 1 = both recovery and AE, 0 = no recovery or no AE or censored
Time in days from transplantation to relapse or last follow-up
Relapse status; 1 = relapse, 0 = censored
Time in days from transplantation to death or last follow-up
Survival status; 1 = dead, 0 = censored
Year of transplantation; factor with levels "1985-1989", "1990-1994", "1995-1998"
Patient age at transplant; factor with levels "<=20", "20-40", ">40"
Prophylaxis; factor with levels "no", "yes"
Donor-recipient gender match; factor with levels "no gender mismatch", "gender mismatch"
We gratefully acknowledge the European Society for Blood and Marrow Transplantation (EBMT) for making available these data. Disclaimer: these data were simplified for the purpose of illustration of the analysis of competing risks and multi-state models and do not reflect any real life situation. No clinical conclusions should be drawn from these data.
Fiocco M, Putter H, van Houwelingen HC (2008). Reduced-rank proportional hazards regression and simulation-based prediction for multi-state models. Statistics in Medicine 27, 4340–4358.
van Houwelingen HC, Putter H (2008). Dynamic predicting by landmarking as an alternative for multi-state modeling: an application to acute lymphoid leukemia data. Lifetime Data Anal 14, 447–463.
Given one or more right-continuous step functions of time, given by vector time and vector or matrix stepf, this function evaluates the step function(s) at a vector of new time points given by newtime. A typical application is when the step function is a non- or semi-parametric estimate of a cumulative hazard or survival function, and the value of this function is required at a set of time points.
evalstep(time, stepf, newtime, subst = -Inf, to.data.frame = FALSE)
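time | A vector of time points at which the step function changes value |
stepf | A vector (of the same length as time), or a matrix or data frame with one row per element of time, containing the values of the step function(s) |
newtime | A vector of time points at which the step function(s) is/are to be evaluated |
subst | A value that is substituted for elements of newtime that are smaller than the minimum of time |
to.data.frame | Determines whether the output is a data frame with the new time points and the values of the step function(s) (if TRUE) or a vector/matrix (if FALSE, the default) |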
The argument time should be ordered, should not contain duplicates or +/- Inf, and should be of the same length as stepf. There are no restrictions on ordering or duplicates of newtime. For elements of newtime that are smaller than the minimum of time, the value of subst is substituted.
Either a vector/matrix containing the step function(s) evaluated at the new time points (if to.data.frame=FALSE (default)), or a data frame with column vectors newtime containing the new time points and res containing the step function evaluated at the new time points (if to.data.frame=TRUE).
Hein Putter [email protected]
tm <- c(0.2,0.5,1,1.2,1.8,4)
ta <- 2*tm
data.frame(time=tm, stepf=ta)
evalstep(time=tm, stepf=ta, newtime=c(0,0.2,0.3,0.6,1,1.5,3,4,5,0.1), subst=0)
evalstep(time=tm, stepf=data.frame(ta=ta,ta2=1/ta),
  newtime=c(0,0.2,0.3,0.6,1,1.5,3,4,5,0.1), subst=0)
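The evaluation rule for a right-continuous step function can be reproduced with base R's findInterval, which may help when checking results by hand. This is a sketch for a single step function given as a vector; the function name step_eval is hypothetical and it is not the package's implementation.

```r
# Hypothetical helper, not part of dynpred.
# time: ordered jump points; stepf: value taken from each jump point onwards
step_eval <- function(time, stepf, newtime, subst = -Inf) {
  # idx[k] = number of jump points <= newtime[k]; 0 if before the first jump
  idx <- findInterval(newtime, time)
  out <- rep(subst, length(newtime))
  out[idx > 0] <- stepf[idx[idx > 0]]
  out
}

tm <- c(0.2, 0.5, 1, 1.2, 1.8, 4)
ta <- 2 * tm
step_eval(tm, ta, newtime = c(0, 0.2, 0.3, 0.6, 1, 1.5, 3, 4, 5, 0.1), subst = 0)
```

Right-continuity means that a new time point equal to a jump point (0.2, 1 and 4 here) already takes the new value, while points before the first jump (0 and 0.1) receive subst.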
Calculate dynamic "death within window" curve, in other words, one minus fixed width conditional survival curves, defined as P(T<=t+w|T>t), for a fixed window width w.
Fwindow(object, width, variance = TRUE, conf.level = 0.95)
object | A survfit object |
width | Width of the window |
variance | Boolean (default=TRUE); whether pointwise confidence intervals are to be calculated |
conf.level | The confidence level, between 0 and 1 (default=0.95) |
"Die within window function" with window w, Fw(t) = P(T<=t+w|T>t), evaluated at all time points t where the estimate changes value, and associated pointwise confidence intervals (if variance=TRUE).
Both the estimate and the pointwise lower and upper confidence limits are based on the negative exponential of the Nelson-Aalen estimate of the cumulative hazard, so P(T<=t+w|T>t) is estimated as 1 - exp(-(hatH_NA(t+w) - hatH_NA(t))), with hatH_NA the non-parametric Nelson-Aalen estimate.
Note: in object, no event time points at or below zero are allowed.
A data frame with columns
time | The time points t at which Fw(t) changes value (either t or t+width is an event time point) |
Fw | The Fw(t) function |
low | Lower end of confidence interval |
up | Upper end of confidence interval |
and with attribute "width" as given in the input.
Hein Putter [email protected]
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(wbc1)
c0 <- coxph(Surv(tyears, d) ~ 1, data = wbc1, method="breslow")
sf0 <- survfit(c0)
Fw <- Fwindow(sf0,4)
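The relation between the Nelson-Aalen cumulative hazard and the "die within window" probability, Fw(t) = 1 - exp(-(H(t+w) - H(t))), can be checked on a toy data set in base R. The helper names nelson_aalen, H_at and Fw_at are hypothetical; this is a point-estimate sketch without confidence intervals, not the code of Fwindow.

```r
# Hypothetical helpers, not part of dynpred.
# Nelson-Aalen estimate: cumulative sum of (events / number at risk)
nelson_aalen <- function(time, status) {
  et <- sort(unique(time[status == 1]))
  d  <- sapply(et, function(s) sum(time == s & status == 1))  # events at s
  y  <- sapply(et, function(s) sum(time >= s))                # at risk at s
  data.frame(time = et, Haz = cumsum(d / y))
}

# Right-continuous evaluation of the cumulative hazard (0 before first event)
H_at <- function(na, t) {
  c(0, na$Haz)[findInterval(t, na$time) + 1]
}

# Probability of dying within (t, t + width], given alive at t
Fw_at <- function(na, t, width) {
  1 - exp(-(H_at(na, t + width) - H_at(na, t)))
}

# Toy data: five subjects, all events (made up for illustration)
na <- nelson_aalen(time = c(1, 2, 3, 4, 5), status = rep(1, 5))
Fw_at(na, t = c(0.5, 1.5, 2.5), width = 2)
```

For t = 0.5 and width 2, the window covers the events at times 1 and 2, so the hazard increment is 1/5 + 1/4 = 0.45.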
A data frame of 295 patients with breast cancer. The included variables are
Patient identification number
Survival status; 1 = death; 0 = censored
Time in years until death or last follow-up
Diameter of the primary tumor
Number of positive lymph nodes
Age of the patient
Estrogen level?
Chemotherapy used (yes/no)
Hormonal therapy used (yes/no)
Type of surgery (excision or mastectomy)
Histological grade (Intermediate, poorly, or well differentiated)
Vascular invasion (-, +, or +/-)
??
Estrogen level?
A data frame, see data.frame.
van't Veer LJ, Dai HY, van de Vijver MJ, He YDD, Hart AAM, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R & Friend SH (2002). Gene expression profiling predicts clinical outcome of breast cancer. Nature 415, 530–536.
van de Vijver MJ, He YD, van 't Veer LJ, Dai H, Hart AAM, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH & Bernards R (2002). A gene-expression signature as a predictor of survival in breast cancer. New England Journal of Medicine 347, 1999–2009.
van Houwelingen HC, Bruinsma T, Hart AAM, van't Veer LJ & Wessels LFA (2006). Cross-validated Cox regression on microarray gene expression data. Statistics in Medicine 25, 3201–3216.
A data frame of 358 patients with ovarian cancer. The included variables are
tyears | Time in years until death or last follow-up |
d | Survival status; 1 = death; 0 = censored |
Karn | Karnofsky score |
Broders | Broders score; factor with levels "unknown", "1", "2", "3", "4" |
FIGO | FIGO stage; factor with levels "III", "IV" |
Ascites | Presence of ascites; factor with levels "unknown", "absent", "present" |
Diam | Diameter of the tumor; factor with levels "micr.", "<1cm", "1-2cm", "2-5cm", ">5cm" |
A data frame, see data.frame.
Neijt, J. P., ten Bokkel Huinink, W. W., van der Burg, M. E., van Oosterom, A. T., Vriesendorp, R., Kooyman, C. D., van Lindert, A. C., Hamerlynck, J. V., van Lent, M. & van Houwelingen, J. C. (1984), 'Randomised trial comparing two combination chemotherapy regimens (Hexa-CAF vs CHAP- 5) in advanced ovarian carcinoma', Lancet 2, 594–600.
Neijt, J. P., ten Bokkel Huinink, W. W., van der Burg, M. E., van Oosterom, A. T., Willemse, P. H., Heintz, A. P., van Lent, M., Trimbos, J. B., Bouma, J. & Vermorken, J. B. (1987), 'Randomized trial comparing two combination chemotherapy regimens (CHAP-5 vs CP) in advanced ovarian carcinoma', Journal of Clinical Oncology 5, 1157–1168.
van Houwelingen, J. C., ten Bokkel Huinink, W. W., van der Burg, M. E., van Oosterom, A. T. & Neijt, J. P. (1989), 'Predictability of the survival of patients with advanced ovarian cancer.', Journal of Clinical Oncology 7, 769–773.
Calculate prediction error curve.
pe(time, status, tsurv, survmat, tcens, censmat, FUN = c("KL", "Brier"), tout)
pecox(formula, censformula, data, censdata, FUN = c("KL", "Brier"), tout, CV = FALSE, progress = FALSE)
time | Vector of time points in data |
status | Vector of event indicators in data |
tsurv | Vector of time points corresponding to the estimated survival probabilities in survmat |
survmat | Matrix of estimated survival probabilities; dimension should be length of tsurv x length of time |
tcens | Vector of time points corresponding to the estimated censoring probabilities in censmat |
censmat | Matrix of estimated censoring probabilities; dimension should be length of tcens x length of time |
FUN | The error function, either "KL" (default) for Kullback-Leibler or "Brier" for the Brier score |
tout | Vector of time points at which to evaluate prediction error. If missing, prediction error will be evaluated at all time points where the estimate changes value |
formula | Formula for prediction model to be used as in coxph |
censformula | Formula for censoring model, also to be used as in coxph |
data | Data set in which to interpret formula |
censdata | Data set in which to interpret censformula |
CV | Boolean (default=FALSE); if TRUE, cross-validation is used |
progress | Boolean (default=FALSE); if TRUE, progress is reported |
The censformula is used to calculate inverse probability of censoring weights (IPCW).
A data frame with columns
time | Event time points |
Err | Prediction error of model specified by formula |
Hein Putter [email protected]
Graf E, Schmoor C, Sauerbrei W & Schumacher M (1999), Assessment and comparison of prognostic classification schemes for survival data, Statistics in Medicine 18, 2529-2545.
Gerds & Schumacher (2006), Consistent estimation of the expected Brier score in general survival models with right-censored event times, Biometrical Journal 48, 1029-1040.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
# Example on a subset, because the effect of CV is clearer
ova2 <- ova[1:100,]
pecox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, data = ova2, FUN="Brier", tout=seq(0,6,by=0.5))
pecox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, data = ova2, FUN="Brier", tout=seq(0,6,by=0.5),
  CV=TRUE, progress=TRUE)
pecox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, data = ova, FUN="Brier", tout=seq(0,6,by=0.5))
pecox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, data = ova, FUN="Brier", tout=seq(0,6,by=0.5),
  CV=TRUE, progress=TRUE)
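The two error functions selectable through FUN can be illustrated at a single time point for data without censoring. The sketch below assumes the usual definitions, with y = 1 if the subject is still alive at t and S the predicted probability of being alive at t; the helper names brier_err and kl_err are hypothetical, and no inverse probability of censoring weights are applied, unlike in pe and pecox.

```r
# Hypothetical helpers, not part of dynpred.
# y: 1 if the subject is alive at time t, 0 otherwise
# S: predicted probability of being alive at time t (strictly between 0 and 1)
brier_err <- function(y, S) (y - S)^2
kl_err    <- function(y, S) -(y * log(S) + (1 - y) * log(1 - S))

# Toy data without censoring (values made up for illustration)
toy_time <- c(1.2, 3.4, 0.7, 5.1)
S_at_2   <- c(0.6, 0.8, 0.3, 0.9)   # predicted P(T > 2) for each subject
y_at_2   <- as.numeric(toy_time > 2)

mean(brier_err(y_at_2, S_at_2))     # average Brier error at t = 2
mean(kl_err(y_at_2, S_at_2))        # average Kullback-Leibler error at t = 2
```

With censored data the observed status at t is unknown for subjects censored before t, which is where the censoring model given through censformula comes in.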
Calculate dynamic fixed width prediction error curve.
pew(time, status, tsurv, survmat, tcens, censmat, width, FUN = c("KL", "Brier"), tout)
pewcox(formula, censformula, width, data, censdata, FUN = c("KL", "Brier"), tout, CV = FALSE, progress = FALSE)
time | Vector of time points in data |
status | Vector of event indicators in data |
tsurv | Vector of time points corresponding to the estimated survival probabilities in survmat |
survmat | Matrix of estimated survival probabilities; dimension should be length of tsurv x length of time |
tcens | Vector of time points corresponding to the estimated censoring probabilities in censmat |
censmat | Matrix of estimated censoring probabilities; dimension should be length of tcens x length of time |
width | Width of the window |
FUN | The error function, either "KL" (default) for Kullback-Leibler or "Brier" for the Brier score |
tout | Vector of time points at which to evaluate prediction error. If missing, prediction error will be evaluated at all time points where the estimate changes value |
formula | Formula for prediction model to be used as in coxph |
censformula | Formula for censoring model, also to be used as in coxph |
data | Data set in which to interpret formula |
censdata | Data set in which to interpret censformula |
CV | Boolean (default=FALSE); if TRUE, cross-validation is used |
progress | Boolean (default=FALSE); if TRUE, progress is reported |
Corresponds to Equation (3.6) in van Houwelingen and Putter (2012). The censformula is used to calculate inverse probability of censoring weights (IPCW).
A data frame with columns
time | Event time points |
Err | Prediction error of model specified by formula |
and with attribute "width" as given in the input.
Hein Putter [email protected]
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
# Example on a subset, because the effect of CV is clearer
ova2 <- ova[1:100,]
pewcox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, width=2, data = ova2, FUN="Brier",
  tout=seq(0,6,by=0.5))
pewcox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, width=2, data = ova2, FUN="Brier",
  tout=seq(0,6,by=0.5), CV=TRUE, progress=TRUE)
pewcox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, width=2, data = ova, FUN="Brier",
  tout=seq(0,6,by=0.5))
pewcox(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam,
  Surv(tyears, 1-d) ~ 1, width=2, data = ova, FUN="Brier",
  tout=seq(0,6,by=0.5), CV=TRUE, progress=TRUE)
Create scatter plot with imputed survival times.
scatterplot(formula, data, horizon, plot = TRUE, xlab)
formula | Formula for prediction model to be used as in coxph |
data | Data set in which to interpret the formula |
horizon | The horizon, maximum value to be imputed in case of censored observations; default is 1.05 times largest event time |
plot | Should the scatter plot actually be plotted? Default is TRUE |
xlab | Label for x-axis |
Imputation is used for censored survival times.
A data frame with columns
x | Predictor (centered at zero) |
imputed | (Imputed) survival time |
and with attribute "horizon" (copied from input or default).
Hein Putter [email protected]
Royston P (2001), The lognormal distribution as a model for survival time in cancer, with an emphasis on prognostic factors, Statistica Neerlandica 55, 89-104.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
scatterplot(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova)
Create a tolerance plot according to the methods of Henderson, Jones & Stare (2001)
toleranceplot(formula, data, coverage = 0.8, horizon, plot = TRUE, xlab)
formula | Formula for prediction model to be used as in coxph |
data | Data set in which to interpret the formula |
coverage | The coverage for the tolerance intervals (default is 0.8) |
horizon | The horizon, maximum value to be imputed in case of censored observations; default is 1.05 times largest event time |
plot | Should the tolerance plot actually be plotted? Default is TRUE |
xlab | Label for x-axis |
Warnings will be issued each time the survival curve corresponding to a value of x never goes below (1-coverage)/2; these warnings may be ignored.
A data frame with columns
x | Predictor (centered at zero) |
lower | Lower bound of tolerance interval |
upper | Upper bound of tolerance interval |
and with attributes "coverage" and "horizon" (copied from input or default).
Hein Putter [email protected]
Henderson R, Jones M & Stare J (2001), Accuracy of point predictions in survival analysis, Statistics in Medicine 20, 3083-3096.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
data(ova)
toleranceplot(Surv(tyears, d) ~ Karn + Broders + FIGO + Ascites + Diam, data = ova)
A data frame of 210 patients with Chronic Myeloid Leukemia from the Benelux CML study (Kluin-Nelemans et al. 1998). Data have been used in two methodological papers, de Bruijne et al. (2001) and van Houwelingen (2007), and in the book van Houwelingen & Putter (2012), especially Chapter 8. More background is given in Appendix A.2 of van Houwelingen & Putter (2012). Interest is in the time-dependent covariate White Blood Cell count (WBC). Data set wbc1 contains the follow-up data and time-fixed covariates, while wbc2 contains the WBC measurements. The included variables in wbc1 are
Patient identification number
Time in years from randomization to death or last follow-up
Survival status; 1 = dead, 0 = censored
Clinical index based on spleen size, percentage of circulating blasts, platelet and age at diagnosis
Age at diagnosis
A data frame, see data.frame.
Kluin-Nelemans JC, Delannoy A, Louwagie A, le Cessie S, Hermans J, van der Burgh JF, Hagemeijer AM, van den Berghe H & Benelux CML Study Group (1998). Randomized study on hydroxyurea alone versus hydroxyurea combined with low-dose interferon-alpha 2b for chronic myeloid leukemia. Blood 91, 2713–2721.
de Bruijne MHJ, le Cessie S, Kluin-Nelemans HC & van Houwelingen HC (2001). On the use of Cox regression in the presence of an irregularly observed time-dependent covariate. Statistics in Medicine 20, 3817–3829.
van Houwelingen HC (2007). Dynamic prediction by landmarking in event history analysis. Scandinavian Journal of Statistics 34, 70–85.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.
A data frame of 210 patients with Chronic Myeloid Leukemia from the Benelux CML study (Kluin-Nelemans et al. 1998). Data have been used in two methodological papers, de Bruijne et al. (2001) and van Houwelingen (2007), and in the book van Houwelingen & Putter (2012), especially Chapter 8. More background is given in Appendix A.2 of van Houwelingen & Putter (2012). Interest is in the time-dependent covariate White Blood Cell count (WBC). Data set wbc1 contains the follow-up data and time-fixed covariates, while wbc2 contains the WBC measurements. The included variables in wbc2 are
Patient identification number
Time of WBC measurement in years from randomization
Log-transformed and standardized WBC measurement, more precisely, defined as lwbc=log10(wbc)-0.95
A data frame, see data.frame.
Kluin-Nelemans JC, Delannoy A, Louwagie A, le Cessie S, Hermans J, van der Burgh JF, Hagemeijer AM, van den Berghe H & Benelux CML Study Group (1998). Randomized study on hydroxyurea alone versus hydroxyurea combined with low-dose interferon-alpha 2b for chronic myeloid leukemia. Blood 91, 2713–2721.
de Bruijne MHJ, le Cessie S, Kluin-Nelemans HC & van Houwelingen HC (2001). On the use of Cox regression in the presence of an irregularly observed time-dependent covariate. Statistics in Medicine 20, 3817–3829.
van Houwelingen HC (2007). Dynamic prediction by landmarking in event history analysis. Scandinavian Journal of Statistics 34, 70–85.
van Houwelingen HC, Putter H (2012). Dynamic Prediction in Clinical Survival Analysis. Chapman & Hall.