Smooth modeling of interval-censored survival data with {icpack}

Introduction

This vignette shows how to use the package {icpack}, using the drugusers data set. A detailed description of the methods can be found in Putter, Gampe & Eilers (2023).

Drug user data

The drug users data are available after loading the package.

library(icpack)
library(survival)

Preliminaries

There are three covariates (see the help file). Here is a summary of their frequencies and distribution.

table(drugusers$period)
## 
## 1972-1980 1981-1985 1986-1991 1992-1997 
##       173       374       306        87
table(drugusers$gender)
## 
##   male female 
##    759    181
summary(drugusers$age)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    9.00   17.00   19.00   20.26   23.00   49.00

Using icfit

The time-to-event data are interval censored. Here is a first fit. The output of the function icfit is an object of class icfit.

icf <- icfit(Surv(left, right, type = "interval2") ~ period + gender + age, data = drugusers)

Information about the estimated coefficients and their significance is obtained by printing the object.

print(icf)
## Object of class 'icfit'
## Call:
## icfit(formula = Surv(left, right, type = "interval2") ~ period + 
##     gender + age, data = drugusers)
## 
## Bins summary: tmin = 0, tmax = 241.4, number of bins = 100, bin width = 2.414
## P-splines summary: number of segments = 20, degree = 3, penalty order = 2
## 
## Parameter estimates:
##                      coef        SE        HR     lower upper pvalue  
## period1981-1985  0.039268  0.114076  1.040050  0.831672 1.301 0.7307  
## period1986-1991 -0.326614  0.129151  0.721362  0.560041 0.929 0.0114 *
## period1992-1997 -0.576174  0.270656  0.562044  0.330665 0.955 0.0333 *
## genderfemale     0.224810  0.109141  1.252084  1.010956 1.551 0.0394 *
## age             -0.004659  0.010490  0.995352  0.975097 1.016 0.6569  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Effective baseline dimension: 1.043, log-likelihood: -2561, AIC: 5124
## Smoothness parameter lambda: 1e+06
## Number of iterations: 50
## n = 940, number of events = 597

The print method also gives basic information on the bins and splines used, as well as the effective dimension, log-likelihood, AIC, tuing parameter, number of iterations, number of subjects and evvents. A summary method is also available, which gives additional information on the iterations and on computation time.

Plotting

A plot of the baseline hazard is obtained by plotting the object. The default option is to add shaded confidence intervals; they can be switched off if desired.

plot(icf, title = 'Drug users')

Plots of the cumulative hazard and the survival function are also available.

plot(icf, type = 'cumhazard', title = 'Drug users')

plot(icf, type = 'survival', title = 'Drug users')

Predicting

The plot function above plots the baseline hazard. Suppose we are interested in the hazards and survival functions for a male aged 25 who started intravenous drug use in each of the four periods. We can use predict with a newdata object to obtain these hazards and survival functions.

newd <- data.frame(period=1:4, gender=1, age=25)
newd$period <- factor(newd$period, levels=1:4, labels=levels(drugusers$period))
newd$gender <- factor(newd$gender, levels=1:2, labels=levels(drugusers$gender))
picf <- predict(icf, newdata=newd)

The result is an object of class predict.icfit, which is (here) a list with 4 items (one for each subject), each containing a data frame with time points and hazards, cumulative hazards, survival functions at these time points with associated standard errors and confidence bounds. The summary method can be used to extract the hazards etc at specified time points.

summary(picf, times=seq(0, 60, by=12))
## [[1]]
##   time        haz       sehaz    lowerhaz   upperhaz       Haz      seHaz
## 1    0 0.02076260 0.002615486 0.016220153 0.02657715 0.0000000 0.00000000
## 2   12 0.01854103 0.002271171 0.014583662 0.02357226 0.2355676 0.02923304
## 3   24 0.01655717 0.001993712 0.013076447 0.02096441 0.4459306 0.05467113
## 4   36 0.01478559 0.001771208 0.011691512 0.01869849 0.6337858 0.07697656
## 5   48 0.01320357 0.001592985 0.010423064 0.01672581 0.8015416 0.09669178
## 6   60 0.01179081 0.001449637 0.009265986 0.01500361 0.9513485 0.11425790
##    lowerHaz  upperHaz      Surv     seSurv lowerSurv upperSurv
## 1 0.0000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.1782719 0.2928633 0.7901262 0.02309683 0.7461298 0.8367173
## 3 0.3387771 0.5530840 0.6402324 0.03500122 0.5751790 0.7126440
## 4 0.4829145 0.7846571 0.5305826 0.04084164 0.4562805 0.6169848
## 5 0.6120292 0.9910540 0.4486390 0.04337919 0.3711880 0.5422510
## 6 0.7274071 1.1752898 0.3862211 0.04412852 0.3087310 0.4831611
## 
## [[2]]
##   time        haz       sehaz   lowerhaz   upperhaz       Haz      seHaz
## 1    0 0.02159413 0.002039438 0.01794505 0.02598524 0.0000000 0.00000000
## 2   12 0.01928359 0.001751007 0.01613973 0.02303985 0.2450019 0.02264612
## 3   24 0.01722028 0.001533398 0.01446252 0.02050389 0.4637899 0.04216152
## 4   36 0.01537774 0.001372813 0.01290933 0.01831815 0.6591686 0.05921658
## 5   48 0.01373236 0.001255970 0.01147873 0.01642845 0.8336430 0.07434621
## 6   60 0.01226303 0.001170780 0.01017025 0.01478645 0.9894496 0.08796912
##    lowerHaz  upperHaz      Surv     seSurv lowerSurv upperSurv
## 1 0.0000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.2006164 0.2893875 0.7827072 0.01772452 0.7487276 0.8182293
## 3 0.3811548 0.5464249 0.6289001 0.02651460 0.5790218 0.6830754
## 4 0.5431063 0.7752310 0.5172847 0.03063122 0.4606016 0.5809436
## 5 0.6879271 0.9793589 0.4344660 0.03230050 0.3755545 0.5026187
## 6 0.8170333 1.1618659 0.3717826 0.03270515 0.3129033 0.4417413
## 
## [[3]]
##   time         haz        sehaz    lowerhaz   upperhaz       Haz      seHaz
## 1    0 0.014977353 0.0015225659 0.012271671 0.01827959 0.0000000 0.00000000
## 2   12 0.013374800 0.0013212145 0.011020526 0.01623201 0.1699295 0.01699864
## 3   24 0.011943717 0.0011666246 0.009862723 0.01446379 0.3216774 0.03180116
## 4   36 0.010665763 0.0010493594 0.008795202 0.01293416 0.4571891 0.04485204
## 5   48 0.009524553 0.0009606996 0.007816056 0.01160651 0.5782018 0.05650514
## 6   60 0.008505446 0.0008929752 0.006923579 0.01044873 0.6862668 0.06703779
##    lowerHaz  upperHaz      Surv     seSurv lowerSurv upperSurv
## 1 0.0000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.1366128 0.2032462 0.8437265 0.01434177 0.8160803 0.8723094
## 3 0.2593483 0.3840066 0.7249345 0.02305328 0.6811302 0.7715560
## 4 0.3692807 0.5450975 0.6330627 0.02839376 0.5797879 0.6912328
## 5 0.4674538 0.6889498 0.5609075 0.03169388 0.5021049 0.6265968
## 6 0.5548751 0.8176585 0.5034529 0.03375020 0.4414652 0.5741446
## 
## [[4]]
##   time         haz       sehaz    lowerhaz   upperhaz       Haz      seHaz
## 1    0 0.011669501 0.002845395 0.007236062 0.01881925 0.0000000 0.00000000
## 2   12 0.010420883 0.002536051 0.006467762 0.01679017 0.1323994 0.03224049
## 3   24 0.009305865 0.002267178 0.005772706 0.01500148 0.2506328 0.06099565
## 4   36 0.008310156 0.002032865 0.005144992 0.01342251 0.3562157 0.08669707
## 5   48 0.007420990 0.001828060 0.004579098 0.01202662 0.4505019 0.10971568
## 6   60 0.006626960 0.001648454 0.004069848 0.01079072 0.5347000 0.13037079
##     lowerHaz  upperHaz      Surv     seSurv lowerSurv upperSurv
## 1 0.00000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.06920922 0.1955896 0.8759924 0.02824176 0.8223525 0.9331318
## 3 0.13108348 0.3701820 0.7783098 0.04747272 0.6906117 0.8771450
## 4 0.18629259 0.5261389 0.7003229 0.06071527 0.5908846 0.8300311
## 5 0.23546315 0.6655407 0.6373092 0.06992232 0.5139973 0.7902051
## 6 0.27917798 0.7902221 0.5858456 0.07637686 0.4537450 0.7564055

Finally, the hazards or cumulative hazards or survival functions can be plotted.

library(gridExtra)
plot(picf, type="surv", ylim=c(0, 1),
     xlab = "Months since start intravenous drug use",
     title = levels(drugusers$period))

## TableGrob (2 x 2) "arrange": 4 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]
## 3 3 (2-2,1-1) arrange gtable[layout]
## 4 4 (2-2,2-2) arrange gtable[layout]

References

Putter, H., Gampe, J., Eilers, P. H. (2023). Smooth hazard regression for interval censored data. Submitted for publication.