PipeOpTaskSurvClassifIPCW
Transform TaskSurv to TaskClassif using the Inverse Probability of Censoring Weights (IPCW) method by Vock et al. (2016).
Let \(T_i\) be the observed times (event or censoring) and \(\delta_i\) the censoring indicators for each observation \(i\) in the training set. The IPCW technique consists of two steps: first we estimate the censoring distribution \(\hat{G}(t)\) using the Kaplan-Meier estimator from the training data. Then we calculate the observation weights given a cutoff time \(\tau\) as:
$$\omega_i = 1/\hat{G}(\min(T_i, \tau))$$
Observations that are censored prior to \(\tau\) are assigned zero weights, i.e. \(\omega_i = 0\).
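The two steps above can be sketched in base R. This is only an illustration on made-up data, not the code used by the PipeOp: the toy values of `time`, `status` and `tau` are assumptions, and the Kaplan-Meier estimate is computed by hand to keep the sketch free of dependencies.

```r
# Toy training data (hypothetical values, for illustration only)
time   = c(5, 8, 12, 20, 25, 30)  # observed times T_i
status = c(1, 0, 1, 0, 1, 0)      # 1 = event, 0 = censored
tau    = 15                       # cutoff time

# Step 1: Kaplan-Meier estimate of the censoring distribution G(t).
# Censoring plays the role of the "event" here, so the censored
# observations (status == 0) are the ones counted at each time point.
t_unique = sort(unique(time))
n_risk = sapply(t_unique, function(t) sum(time >= t))
n_cens = sapply(t_unique, function(t) sum(time == t & status == 0))
G_hat  = stepfun(t_unique, c(1, cumprod(1 - n_cens / n_risk)))

# Step 2: weights 1 / G_hat(min(T_i, tau)), set to zero for
# observations censored before tau
weights = 1 / G_hat(pmin(time, tau))
weights[status == 0 & time < tau] = 0

weights
```

In this toy example only the second observation is censored before the cutoff, so it is the only one that receives a zero weight; the remaining observations are up-weighted to compensate.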
Dictionary
This PipeOp can be instantiated via the dictionary mlr3pipelines::mlr_pipeops or with the associated sugar function mlr3pipelines::po().
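A minimal sketch of both routes, assuming the package that provides this PipeOp (mlr3proba) is installed and loaded so that the key "trafotask_survclassif_IPCW" is registered in the dictionary:

```r
library(mlr3pipelines)
library(mlr3proba)  # assumed to register the PipeOp in mlr_pipeops

# via the dictionary
po_ipcw = mlr_pipeops$get("trafotask_survclassif_IPCW")

# via the sugar function
po_ipcw = po("trafotask_survclassif_IPCW")
```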
Input and Output Channels
PipeOpTaskSurvClassifIPCW has one input channel named "input", and two output channels, one named "output" and the other "data".
Training transforms the "input" TaskSurv to a TaskClassif, which is the "output". The target column is named "status" and indicates whether an event occurred before the cutoff time \(\tau\) (1 = yes, 0 = no). The observed times column is removed from the "output" task. The transformed task has the property "weights" (the \(\omega_i\)). The "data" is NULL.
During prediction, the "input" TaskSurv is transformed to the "output" TaskClassif with "status" as target (again indicating if the event occurred before the cutoff time). The "data" is a data.table containing the observed times \(T_i\) and censoring indicators/status \(\delta_i\) of each subject as well as the corresponding row_ids. This "data" is only meant to be used with the PipeOpPredClassifSurvIPCW.
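The two output channels can be inspected directly, as in the following sketch. It assumes that mlr3proba is loaded and that the list returned by `$train()` and `$predict()` is named after the output channels ("output" and "data"); for brevity the same task is used for training and prediction.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3proba)  # assumed to provide the "lung" task and the PipeOp

task = tsk("lung")
po_ipcw = po("trafotask_survclassif_IPCW", tau = 365)

# training: "output" is the weighted classification task, "data" is NULL
out_train = po_ipcw$train(list(task))
out_train$output
is.null(out_train$data)

# prediction: "data" holds the row ids, observed times and status
out_pred = po_ipcw$predict(list(task))
out_pred$output
head(out_pred$data)
```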
Parameters
The parameters are:
tau :: numeric()
Predefined time point for IPCW. Observations with time larger than \(\tau\) are censored. Must be less than or equal to the maximum event time.
eps :: numeric()
Small value to replace \(G(t) = 0\) censoring probabilities to prevent infinite weights (a warning is triggered if this happens).
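The role of eps can be illustrated with a short base R sketch. The values below are made up, and the chosen `eps` is a hypothetical value rather than the PipeOp's default; the point is only that a zero censoring probability would otherwise produce an infinite weight.

```r
eps = 1e-3                    # hypothetical value, not necessarily the default
G_values = c(1, 0.8, 0.5, 0)  # made-up estimates of G at min(T_i, tau)

# Without this guard the last weight would be 1 / 0 = Inf
G_values[G_values == 0] = eps
weights = 1 / G_values

weights
```

A smaller eps keeps the replacement closer to the estimated zero probability but yields a larger weight for the affected observations.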
References
Vock, D. M., Wolfson, J., Bandyopadhyay, S., Adomavicius, G., Johnson, P. E., Vazquez-Benitez, G., O'Connor, P. J. (2016). “Adapting machine learning techniques to censored time-to-event health record data: A general-purpose approach using inverse probability of censoring weighting.” Journal of Biomedical Informatics, 61, 119–131. doi:10.1016/j.jbi.2016.03.009, https://www.sciencedirect.com/science/article/pii/S1532046416000496.
Super class
mlr3pipelines::PipeOp -> PipeOpTaskSurvClassifIPCW
Methods
Method new()
Creates a new instance of this R6 class.
Usage
PipeOpTaskSurvClassifIPCW$new(id = "trafotask_survclassif_IPCW")
Examples
library(mlr3)
library(mlr3proba) # provides the "lung" task and the IPCW pipeop
library(mlr3learners)
library(mlr3pipelines)

task = tsk("lung")

# split task to train and test subtasks
part = partition(task)
task_train = task$clone()$filter(part$train)
task_test = task$clone()$filter(part$test)

# define IPCW pipeop with a cutoff time of 365 days
po_ipcw = po("trafotask_survclassif_IPCW", tau = 365)

# during training, output is a classification task with weights
task_classif_train = po_ipcw$train(list(task_train))[[1]]
task_classif_train

# during prediction, output is a classification task (no weights)
task_classif_test = po_ipcw$predict(list(task_test))[[1]]
task_classif_test

# train classif learner on the train task with weights
learner = lrn("classif.rpart", predict_type = "prob")
learner$train(task_classif_train)

# predict using the test output task
p = learner$predict(task_classif_test)

# use classif measures for evaluation
p$confusion
p$score()
p$score(msr("classif.auc"))