Transforms a TaskSurv into a TaskClassif by dividing continuous time into multiple time intervals for each observation. The transformation creates a new target variable, disc_status, that indicates whether an event occurred within each time interval. This makes it possible to perform survival analysis within a classification framework using discrete time intervals (Tutz and Schmid 2016).
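
As a rough illustration of the idea, the following base R sketch expands each observation into one row per interval in which it is still at risk. The helper discretize_obs() is hypothetical and not part of mlr3proba, and the boundary convention (intervals closed on the right) is an assumption made for this example only.

```r
# Hypothetical helper, not part of mlr3proba: expand one observation into
# one row per time interval in which it is still at risk.
discretize_obs = function(id, time, status, cut) {
  # interval start points: 0, followed by all split points except the last
  starts = c(0, head(cut, -1L))
  # an observation is at risk in every interval that starts before its time
  at_risk = starts < time
  tend = cut[at_risk]
  # the event is flagged only in the interval that contains the observed time
  disc_status = as.integer(status == 1L & time <= tend)
  data.frame(id = id, tend = tend, disc_status = disc_status)
}

cut = c(5, 10, 15)
long = rbind(
  discretize_obs(id = 1L, time = 12, status = 1L, cut = cut), # event at t = 12
  discretize_obs(id = 2L, time = 7, status = 0L, cut = cut)   # censored at t = 7
)
print(long)
```

Observation 1 contributes three rows and is flagged only in the interval ending at 15, while the censored observation 2 contributes two rows with no event flagged.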

Dictionary

This PipeOp can be instantiated via the dictionary mlr3pipelines::mlr_pipeops or with the associated sugar function mlr3pipelines::po():

PipeOpTaskSurvClassifDiscTime$new()
mlr_pipeops$get("trafotask_survclassif_disctime")
po("trafotask_survclassif_disctime")

Input and Output Channels

PipeOpTaskSurvClassifDiscTime has one input channel named "input", and two output channels, one named "output" and the other "transformed_data".

During training, the "output" is the "input" TaskSurv transformed to a TaskClassif. The target column is named "disc_status" and indicates whether an event occurred in each time interval. An additional feature named "tend" contains the end time point of each interval. Lastly, the "output" task has a column with the original observation ids, under the role "original_ids". The "transformed_data" is an empty data.table.

During prediction, the "input" TaskSurv is transformed to the "output" TaskClassif with "disc_status" as target and the "tend" feature included. The "transformed_data" is a data.table with the following columns: "disc_status" (the target of the "output" task), "id" (original observation ids), "obs_times" (observed times per "id") and "tend" (end time of each interval). This "transformed_data" is only meant to be used with PipeOpPredClassifSurvDiscTime.
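
A minimal sketch of how the two channels might be inspected is shown below. It assumes that mlr3, mlr3proba and mlr3pipelines (and any packages the PipeOp itself depends on) are installed, and it relies only on the channel names described above.

```r
if (requireNamespace("mlr3proba", quietly = TRUE) &&
    requireNamespace("mlr3pipelines", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)
  library(mlr3pipelines)

  task = tsk("lung")
  po_disc = po("trafotask_survclassif_disctime")

  # training: the result is a list with one element per output channel
  res_train = po_disc$train(list(task))
  print(names(res_train))
  print(res_train$output)            # the transformed TaskClassif
  print(res_train$transformed_data)  # documented to be empty during training

  # prediction: "transformed_data" now carries the per-interval information
  res_pred = po_disc$predict(list(task))
  print(head(res_pred$transformed_data))
}
```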

State

The $state contains information about the cut parameter used.

Parameters

The parameters are:

  • cut :: numeric()
    Split points, used to partition the data into intervals based on the time column. If unspecified, all unique event times will be used. If cut is a single integer, it will be interpreted as the number of equidistant intervals from 0 until the maximum event time.

  • max_time :: numeric(1)
    If cut is unspecified, this will be the last possible event time. All event times after max_time will be administratively censored at max_time. Needs to be greater than the minimum event time in the given task.
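
The effect of these parameters can be sketched in base R on made-up data. This is an illustration of the description above, not the package's implementation; in particular, whether the equidistant grid includes both endpoints is an assumption.

```r
# Toy data; the values are made up for illustration.
time = c(4, 9, 12, 20, 20, 31)
status = c(1, 0, 1, 1, 0, 1)
event_times = time[status == 1]

# cut unspecified: all unique event times serve as split points
cut_default = sort(unique(event_times))

# cut given as a single integer: number of equidistant intervals
# from 0 until the maximum event time (endpoints included by assumption)
n_intervals = 4L
cut_equi = seq(0, max(event_times), length.out = n_intervals + 1L)

# max_time: observations after max_time are administratively censored there
max_time = 15
time_cens = pmin(time, max_time)
status_cens = ifelse(time > max_time, 0, status)

print(cut_default)
print(cut_equi)
print(data.frame(time, status, time_cens, status_cens))
```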

References

Tutz, Gerhard, Schmid, Matthias (2016). Modeling Discrete Time-to-Event Data. Springer Series in Statistics. Springer International Publishing. ISBN 978-3-319-28156-8, 978-3-319-28158-2. http://link.springer.com/10.1007/978-3-319-28158-2.

Super class

mlr3pipelines::PipeOp -> PipeOpTaskSurvClassifDiscTime

Methods

Method new()

Creates a new instance of this R6 class.

Usage

PipeOpTaskSurvClassifDiscTime$new(id = "trafotask_survclassif_disctime")

Arguments

id

(character(1))
Identifier of the resulting object.


Method clone()

The objects of this class are cloneable with this method.

Usage

PipeOpTaskSurvClassifDiscTime$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

if (FALSE) { # \dontrun{
if (requireNamespace("mlr3pipelines", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba) # provides tsk("lung") and the PipeOp used below
  library(mlr3learners)
  library(mlr3pipelines)

  task = tsk("lung")

  # transform the survival task to a classification task
  # all unique event times are used as cutpoints
  po_disc = po("trafotask_survclassif_disctime")
  task_classif = po_disc$train(list(task))[[1L]]

  # the end time points of the discrete time intervals
  unique(task_classif$data(cols = "tend"))[[1L]]

  # train a classification learner
  learner = lrn("classif.log_reg", predict_type = "prob")
  learner$train(task_classif)
}
} # }