This vignette shows how to use the package {icpack
},
using the drugusers
data set. A detailed description of the
methods can be found in Putter, Gampe & Eilers (2023).
The drug users data are available after loading the package.
There are three covariates (see the help file). Here is a summary of their frequencies and distribution.
##
## 1972-1980 1981-1985 1986-1991 1992-1997
## 173 374 306 87
##
## male female
## 759 181
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 9.00 17.00 19.00 20.26 23.00 49.00
The time-to-event data are interval censored. Here is a first fit.
The output of the function icfit
is an object of class
icfit
.
Information about the estimated coefficients and their significance is obtained by printing the object.
## Object of class 'icfit'
## Call:
## icfit(formula = Surv(left, right, type = "interval2") ~ period +
## gender + age, data = drugusers)
##
## Bins summary: tmin = 0, tmax = 241.4, number of bins = 100, bin width = 2.414
## P-splines summary: number of segments = 20, degree = 3, penalty order = 2
##
## Parameter estimates:
## coef SE HR lower upper pvalue
## period1981-1985 0.039268 0.114076 1.040050 0.831672 1.301 0.7307
## period1986-1991 -0.326614 0.129151 0.721362 0.560041 0.929 0.0114 *
## period1992-1997 -0.576174 0.270656 0.562044 0.330665 0.955 0.0333 *
## genderfemale 0.224810 0.109141 1.252084 1.010956 1.551 0.0394 *
## age -0.004659 0.010490 0.995352 0.975097 1.016 0.6569
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Effective baseline dimension: 1.043, log-likelihood: -2561, AIC: 5124
## Smoothness parameter lambda: 1e+06
## Number of iterations: 50
## n = 940, number of events = 597
The print method also gives basic information on the bins and splines used, as well as the effective dimension, log-likelihood, AIC, tuing parameter, number of iterations, number of subjects and evvents. A summary method is also available, which gives additional information on the iterations and on computation time.
A plot of the baseline hazard is obtained by plotting the object. The default option is to add shaded confidence intervals; they can be switched off if desired.
Plots of the cumulative hazard and the survival function are also available.
The plot function above plots the baseline hazard. Suppose we are
interested in the hazards and survival functions for a male aged 25 who
started intravenous drug use in each of the four periods. We can use
predict
with a newdata
object to obtain these
hazards and survival functions.
newd <- data.frame(period=1:4, gender=1, age=25)
newd$period <- factor(newd$period, levels=1:4, labels=levels(drugusers$period))
newd$gender <- factor(newd$gender, levels=1:2, labels=levels(drugusers$gender))
picf <- predict(icf, newdata=newd)
The result is an object of class predict.icfit
, which is
(here) a list with 4 items (one for each subject), each containing a
data frame with time points and hazards, cumulative hazards, survival
functions at these time points with associated standard errors and
confidence bounds. The summary
method can be used to
extract the hazards etc at specified time points.
## [[1]]
## time haz sehaz lowerhaz upperhaz Haz seHaz
## 1 0 0.02076260 0.002615486 0.016220153 0.02657715 0.0000000 0.00000000
## 2 12 0.01854103 0.002271171 0.014583662 0.02357226 0.2355676 0.02923304
## 3 24 0.01655717 0.001993712 0.013076447 0.02096441 0.4459306 0.05467113
## 4 36 0.01478559 0.001771208 0.011691512 0.01869849 0.6337858 0.07697656
## 5 48 0.01320357 0.001592985 0.010423064 0.01672581 0.8015416 0.09669178
## 6 60 0.01179081 0.001449637 0.009265986 0.01500361 0.9513485 0.11425790
## lowerHaz upperHaz Surv seSurv lowerSurv upperSurv
## 1 0.0000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.1782719 0.2928633 0.7901262 0.02309683 0.7461298 0.8367173
## 3 0.3387771 0.5530840 0.6402324 0.03500122 0.5751790 0.7126440
## 4 0.4829145 0.7846571 0.5305826 0.04084164 0.4562805 0.6169848
## 5 0.6120292 0.9910540 0.4486390 0.04337919 0.3711880 0.5422510
## 6 0.7274071 1.1752898 0.3862211 0.04412852 0.3087310 0.4831611
##
## [[2]]
## time haz sehaz lowerhaz upperhaz Haz seHaz
## 1 0 0.02159413 0.002039438 0.01794505 0.02598524 0.0000000 0.00000000
## 2 12 0.01928359 0.001751007 0.01613973 0.02303985 0.2450019 0.02264612
## 3 24 0.01722028 0.001533398 0.01446252 0.02050389 0.4637899 0.04216152
## 4 36 0.01537774 0.001372813 0.01290933 0.01831815 0.6591686 0.05921658
## 5 48 0.01373236 0.001255970 0.01147873 0.01642845 0.8336430 0.07434621
## 6 60 0.01226303 0.001170780 0.01017025 0.01478645 0.9894496 0.08796912
## lowerHaz upperHaz Surv seSurv lowerSurv upperSurv
## 1 0.0000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.2006164 0.2893875 0.7827072 0.01772452 0.7487276 0.8182293
## 3 0.3811548 0.5464249 0.6289001 0.02651460 0.5790218 0.6830754
## 4 0.5431063 0.7752310 0.5172847 0.03063122 0.4606016 0.5809436
## 5 0.6879271 0.9793589 0.4344660 0.03230050 0.3755545 0.5026187
## 6 0.8170333 1.1618659 0.3717826 0.03270515 0.3129033 0.4417413
##
## [[3]]
## time haz sehaz lowerhaz upperhaz Haz seHaz
## 1 0 0.014977353 0.0015225659 0.012271671 0.01827959 0.0000000 0.00000000
## 2 12 0.013374800 0.0013212145 0.011020526 0.01623201 0.1699295 0.01699864
## 3 24 0.011943717 0.0011666246 0.009862723 0.01446379 0.3216774 0.03180116
## 4 36 0.010665763 0.0010493594 0.008795202 0.01293416 0.4571891 0.04485204
## 5 48 0.009524553 0.0009606996 0.007816056 0.01160651 0.5782018 0.05650514
## 6 60 0.008505446 0.0008929752 0.006923579 0.01044873 0.6862668 0.06703779
## lowerHaz upperHaz Surv seSurv lowerSurv upperSurv
## 1 0.0000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.1366128 0.2032462 0.8437265 0.01434177 0.8160803 0.8723094
## 3 0.2593483 0.3840066 0.7249345 0.02305328 0.6811302 0.7715560
## 4 0.3692807 0.5450975 0.6330627 0.02839376 0.5797879 0.6912328
## 5 0.4674538 0.6889498 0.5609075 0.03169388 0.5021049 0.6265968
## 6 0.5548751 0.8176585 0.5034529 0.03375020 0.4414652 0.5741446
##
## [[4]]
## time haz sehaz lowerhaz upperhaz Haz seHaz
## 1 0 0.011669501 0.002845395 0.007236062 0.01881925 0.0000000 0.00000000
## 2 12 0.010420883 0.002536051 0.006467762 0.01679017 0.1323994 0.03224049
## 3 24 0.009305865 0.002267178 0.005772706 0.01500148 0.2506328 0.06099565
## 4 36 0.008310156 0.002032865 0.005144992 0.01342251 0.3562157 0.08669707
## 5 48 0.007420990 0.001828060 0.004579098 0.01202662 0.4505019 0.10971568
## 6 60 0.006626960 0.001648454 0.004069848 0.01079072 0.5347000 0.13037079
## lowerHaz upperHaz Surv seSurv lowerSurv upperSurv
## 1 0.00000000 0.0000000 1.0000000 0.00000000 1.0000000 1.0000000
## 2 0.06920922 0.1955896 0.8759924 0.02824176 0.8223525 0.9331318
## 3 0.13108348 0.3701820 0.7783098 0.04747272 0.6906117 0.8771450
## 4 0.18629259 0.5261389 0.7003229 0.06071527 0.5908846 0.8300311
## 5 0.23546315 0.6655407 0.6373092 0.06992232 0.5139973 0.7902051
## 6 0.27917798 0.7902221 0.5858456 0.07637686 0.4537450 0.7564055
Finally, the hazards or cumulative hazards or survival functions can be plotted.
library(gridExtra)
plot(picf, type="surv", ylim=c(0, 1),
xlab = "Months since start intravenous drug use",
title = levels(drugusers$period))
## TableGrob (2 x 2) "arrange": 4 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]
## 3 3 (2-2,1-1) arrange gtable[layout]
## 4 4 (2-2,2-2) arrange gtable[layout]
Putter, H., Gampe, J., Eilers, P. H. (2023). Smooth hazard regression for interval censored data. Submitted for publication.