PipeOpTaskSurvClassifIPCW
Source: R/PipeOpTaskSurvClassifIPCW.R
mlr_pipeops_trafotask_survclassif_IPCW.Rd
Transform TaskSurv to TaskClassif using the Inverse Probability of Censoring Weights (IPCW) method by Vock et al. (2016).
Let \(T_i\) be the observed times (event or censoring) and \(\delta_i\) the censoring indicators for each observation \(i\) in the training set. The IPCW technique consists of two steps: first we estimate the censoring distribution \(\hat{G}(t)\) using the Kaplan-Meier estimator from the training data. Then we calculate the observation weights given a cutoff time \(\tau\) as:
$$\omega_i = 1/\hat{G}(\min(T_i, \tau))$$
Observations that are censored prior to \(\tau\) are assigned zero weights, i.e. \(\omega_i = 0\).
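The two steps above can be sketched outside of the pipeline with the survival package. This is only an illustrative sketch on hypothetical toy data (the vectors time and status and the value of tau are made up, and there are no tied times); it evaluates the Kaplan-Meier step function at the time point itself, and the PipeOp's internal handling of ties or left limits may differ.

```r
library(survival)

# Hypothetical toy data: observed times and status (1 = event, 0 = censored)
time   = c(2, 3, 4, 5, 7, 8, 10)
status = c(1, 0, 1, 0, 1, 1, 0)
tau    = 6

# Step 1: Kaplan-Meier estimate of the censoring distribution G(t),
# obtained by treating censoring (status == 0) as the "event"
cens_fit = survfit(Surv(time, 1 - status) ~ 1)
G_hat = stepfun(cens_fit$time, c(1, cens_fit$surv))

# Step 2: weights 1 / G(min(T_i, tau))
w = 1 / G_hat(pmin(time, tau))

# Observations censored before tau get zero weight
w[status == 0 & time < tau] = 0

print(round(w, 3))
# [1] 1.0 0.0 1.2 0.0 1.6 1.6 1.6
```

Here the observation censored at time 10 keeps a positive weight because it is known to be event-free at the cutoff, whereas the two observations censored at times 3 and 5 have unknown status at the cutoff and are dropped via zero weights.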
Dictionary
This PipeOp can be instantiated via the dictionary mlr3pipelines::mlr_pipeops or with the associated sugar function mlr3pipelines::po().
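A minimal sketch of both routes, using the key "trafotask_survclassif_IPCW" that appears in the Usage and Examples sections. It assumes that the package providing this PipeOp (taken here to be mlr3proba) is loaded so that the key is registered in the dictionary.

```r
library(mlr3pipelines)
library(mlr3proba)  # assumption: loading this package registers the PipeOp

# via the dictionary
po_dict = mlr_pipeops$get("trafotask_survclassif_IPCW")

# via the sugar function
po_sugar = po("trafotask_survclassif_IPCW")
```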
Input and Output Channels
PipeOpTaskSurvClassifIPCW has one input channel named "input", and two output channels, one named "output" and the other "data".
Training transforms the "input" TaskSurv to a TaskClassif,
which is the "output".
The target column is named "status" and indicates whether an event occurred
before the cutoff time \(\tau\) (1 = yes, 0 = no).
The observed times column is removed from the "output" task.
The transformed task has the property "weights_learner" (the \(\omega_i\)).
The "data" is NULL.
During prediction, the "input" TaskSurv is transformed to the "output"
TaskClassif with "status" as target (again indicating
if the event occurred before the cutoff time).
The "data" is a data.table containing the observed times \(T_i\) and
censoring indicators/status \(\delta_i\) of each subject as well as the corresponding
row_ids.
This "data" is only meant to be used with the PipeOpPredClassifSurvIPCW.
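The channel behaviour described above could be inspected as follows. This is a sketch under two assumptions: mlr3proba is installed and provides tsk("lung"), and the lists returned by $train() and $predict() are named after the output channels. For brevity the same task is used for training and prediction.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3proba)  # assumption: provides tsk("lung") and the PipeOp

task = tsk("lung")
po_ipcw = po("trafotask_survclassif_IPCW", tau = 365)

# training: "output" is the weighted classification task, "data" is NULL
out_train = po_ipcw$train(list(task))
names(out_train)
out_train$output
is.null(out_train$data)

# prediction: "data" holds row ids, observed times and status
out_pred = po_ipcw$predict(list(task))
out_pred$output
head(out_pred$data)
```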
Parameters
The parameters are:
tau::numeric()
Predefined time point for IPCW. Observations with time larger than \(\tau\) are censored. Must be less than or equal to the maximum event time.
eps::numeric()
Small value that replaces censoring probabilities \(\hat{G}(t) = 0\) to prevent infinite weights (a warning is triggered if this happens).
References
Vock, David M, Wolfson, Julian, Bandyopadhyay, Sunayan, Adomavicius, Gediminas, Johnson, Paul E, Vazquez-Benitez, Gabriela, O'Connor, Patrick J (2016). “Adapting machine learning techniques to censored time-to-event health record data: A general-purpose approach using inverse probability of censoring weighting.” Journal of Biomedical Informatics, 61, 119–131. doi:10.1016/j.jbi.2016.03.009, https://www.sciencedirect.com/science/article/pii/S1532046416000496.
Super class
mlr3pipelines::PipeOp -> PipeOpTaskSurvClassifIPCW
Methods
Method new()
Creates a new instance of this R6 class.
Usage
PipeOpTaskSurvClassifIPCW$new(id = "trafotask_survclassif_IPCW")
Examples
if (FALSE) { # \dontrun{
library(mlr3)
library(mlr3proba) # provides tsk("lung") and the IPCW pipeop
library(mlr3learners)
library(mlr3pipelines)
task = tsk("lung")
# split task to train and test subtasks
part = partition(task)
task_train = task$clone()$filter(part$train)
task_test = task$clone()$filter(part$test)
# define IPCW pipeop
po_ipcw = po("trafotask_survclassif_IPCW", tau = 365)
# during training, output is a classification task with weights
task_classif_train = po_ipcw$train(list(task_train))[[1]]
task_classif_train
# during prediction, output is a classification task (no weights)
task_classif_test = po_ipcw$predict(list(task_test))[[1]]
task_classif_test
# train classif learner on the train task with weights
learner = lrn("classif.rpart", predict_type = "prob")
learner$train(task_classif_train)
# predict using the test output task
p = learner$predict(task_classif_test)
# use classif measures for evaluation
p$confusion
p$score()
p$score(msr("classif.auc"))
} # }