# PipeOpTaskSurvClassifIPCW

Source: `R/PipeOpTaskSurvClassifIPCW.R`

`mlr_pipeops_trafotask_survclassif_IPCW.Rd`

Transform TaskSurv to TaskClassif using the **I**nverse
**P**robability of **C**ensoring **W**eights (IPCW) method by Vock et al. (2016).

Let \(T_i\) be the observed times (event or censoring) and \(\delta_i\) the censoring indicators for each observation \(i\) in the training set. The IPCW technique consists of two steps: first we estimate the censoring distribution \(\hat{G}(t)\) using the Kaplan-Meier estimator from the training data. Then we calculate the observation weights given a cutoff time \(\tau\) as:

$$\omega_i = 1/\hat{G}(\min(T_i, \tau))$$

Observations that are censored prior to \(\tau\) are assigned zero weights, i.e. \(\omega_i = 0\).
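The two steps above can be sketched in base R on a small toy data set. This is an illustrative sketch, not the package's implementation: the data, the helper `km_censoring()` and the variable names are hypothetical, the Kaplan-Meier estimate is assumed to be right-continuous, and the handling of observations falling exactly on \(\tau\) is an assumption (the toy data contains none).

```r
# Toy data (hypothetical): observed times T_i and indicators delta_i
# (1 = event, 0 = censored), plus a cutoff time tau
time   <- c(2, 3, 5, 7, 8, 10)
status <- c(1, 0, 1, 0, 1, 1)
tau    <- 6

# Step 1: Kaplan-Meier estimate of the censoring distribution G(t).
# Censoring (status == 0) plays the role of the "event" here.
km_censoring <- function(time, status) {
  t_uniq <- sort(unique(time))
  n_risk <- vapply(t_uniq, function(t) sum(time >= t), integer(1))
  n_cens <- vapply(t_uniq, function(t) sum(time == t & status == 0), integer(1))
  # right-continuous step function with G(t) = 1 before the first observed time
  stepfun(t_uniq, c(1, cumprod(1 - n_cens / n_risk)))
}
G_hat <- km_censoring(time, status)

# Step 2: weights 1 / G(min(T_i, tau)) ...
ipc_weights <- 1 / G_hat(pmin(time, tau))
# ... and zero weight for observations censored before tau
ipc_weights[status == 0 & time < tau] <- 0

# Binary classification target: did the event occur by the cutoff?
status_tau <- as.integer(status == 1 & time <= tau)

data.frame(time, status, status_tau, ipc_weights)
```

In this toy example the observation censored at time 3 gets zero weight, while the observation censored at time 7 (after \(\tau\)) keeps a positive weight, because it is known to be event-free at the cutoff.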

## Dictionary

This PipeOp can be instantiated via the dictionary mlr3pipelines::mlr_pipeops or with the associated sugar function `mlr3pipelines::po()`.

## Input and Output Channels

PipeOpTaskSurvClassifIPCW has one input channel named "input", and two output channels, one named "output" and the other "data".

Training transforms the "input" TaskSurv to a TaskClassif, which is the "output".
The target column is named `"status"` and indicates whether **an event occurred before the cutoff time** \(\tau\) (`1` = yes, `0` = no).
The observed times column is removed from the "output" task.
The transformed task has the property `"weights"` (the \(\omega_i\)).
The "data" is `NULL`.

During prediction, the "input" TaskSurv is transformed to the "output" TaskClassif with `"status"` as target (again indicating whether the event occurred before the cutoff time).
The "data" is a data.table containing the observed `times` \(T_i\) and censoring indicators/`status` \(\delta_i\) of each subject as well as the corresponding `row_ids`.
This "data" is only meant to be used with the PipeOpPredClassifSurvIPCW.

## Parameters

The parameters are:

- `tau` :: `numeric()`
  Predefined time point for IPCW. Observations with time larger than \(\tau\) are censored. Must be less than or equal to the maximum event time.
- `eps` :: `numeric()`
  Small value that replaces zero censoring probabilities, \(\hat{G}(t) = 0\), to prevent infinite weights (a warning is triggered if this happens).
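The role of `eps` can be illustrated with a minimal base R sketch. The probabilities and the value of `eps` below are hypothetical and chosen only for illustration; the sketch also omits the warning that the PipeOp triggers in this situation.

```r
# Hypothetical censoring probabilities G(min(T_i, tau)); the last one is zero
cens_probs <- c(0.9, 0.5, 0)
eps <- 1e-3  # illustrative value, not necessarily the PipeOp's default

# Without this guard, the third weight would be 1 / 0 = Inf
cens_probs[cens_probs == 0] <- eps
ipc_weights <- 1 / cens_probs
ipc_weights
```

A smaller `eps` yields a larger (but still finite) weight for the affected observations.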

## References

Vock, D M, Wolfson, Julian, Bandyopadhyay, Sunayan, Adomavicius, Gediminas, Johnson, P E, Vazquez-Benitez, Gabriela, O'Connor, P J (2016).
“Adapting machine learning techniques to censored time-to-event health record data: A general-purpose approach using inverse probability of censoring weighting.”
*Journal of Biomedical Informatics*, **61**, 119–131.
doi:10.1016/j.jbi.2016.03.009, https://www.sciencedirect.com/science/article/pii/S1532046416000496.

## Super class

`mlr3pipelines::PipeOp` -> `PipeOpTaskSurvClassifIPCW`

## Methods

### Method `new()`

Creates a new instance of this R6 class.

#### Usage

`PipeOpTaskSurvClassifIPCW$new(id = "trafotask_survclassif_IPCW")`

## Examples

```
library(mlr3)
library(mlr3proba) # provides the "lung" survival task and this PipeOp
library(mlr3learners)
library(mlr3pipelines)

task = tsk("lung")

# split the task into train and test subtasks
part = partition(task)
task_train = task$clone()$filter(part$train)
task_test = task$clone()$filter(part$test)

# define the IPCW pipeop with a cutoff time of 365 days
po_ipcw = po("trafotask_survclassif_IPCW", tau = 365)

# during training, the output is a classification task with weights
task_classif_train = po_ipcw$train(list(task_train))[[1]]
task_classif_train

# during prediction, the output is a classification task (no weights)
task_classif_test = po_ipcw$predict(list(task_test))[[1]]
task_classif_test

# train a classification learner on the weighted train task
learner = lrn("classif.rpart", predict_type = "prob")
learner$train(task_classif_train)

# predict using the test output task
p = learner$predict(task_classif_test)

# use classification measures for evaluation
p$confusion
p$score()
p$score(msr("classif.auc"))
```