Survival to Regression Reduction Pipeline — mlr_graphs

Wrapper around multiple PipeOps to help create complex survival-to-regression reduction pipelines. Three reductions are currently implemented; see Details.

Usage

pipeline_survtoregr(
  method = 1,
  regr_learner = lrn("regr.featureless"),
  distrcompose = TRUE,
  distr_estimator = lrn("surv.kaplan"),
  regr_se_learner = NULL,
  surv_learner = lrn("surv.coxph"),
  survregr_params = list(method = "ipcw", estimator = "kaplan", alpha = 1),
  distrcompose_params = list(form = "aft"),
  probregr_params = list(dist = "Uniform"),
  learnercv_params = list(resampling.method = "insample"),
  graph_learner = FALSE
)

Arguments

method: (integer(1))
Reduction method to use; the number corresponds to the strategies listed in Details. Default is 1.
regr_learner: LearnerRegr
Regression learner to fit to the transformed TaskRegr. If regr_se_learner is NULL in method 2, then regr_learner must have se predict_type.
distrcompose: (logical(1))
For method 3, if TRUE (default), PipeOpDistrCompositor is used to transform the deterministic predictions into a survival distribution.
distr_estimator: LearnerSurv
For methods 1 and 3, if distrcompose = TRUE, specifies the learner used to estimate the baseline hazard; it must have predict_type distr.
regr_se_learner: LearnerRegr
For method 2, if regr_learner is not used to predict the se, a LearnerRegr with se predict_type must be provided.
surv_learner: LearnerSurv
For method 3, a LearnerSurv with lp predict type to estimate linear predictors.
survregr_params: (list())
Parameters passed to PipeOpTaskSurvRegr. The defaults perform the survival to regression transformation via ipcw, with weights determined by Kaplan-Meier and no additional penalty for censoring.
distrcompose_params: (list())
Parameters passed to PipeOpDistrCompositor. The default is the accelerated failure time model form.
probregr_params: (list())
Parameters passed to PipeOpProbregr. The default is the Uniform distribution for composition.
learnercv_params: (list())
Parameters passed to PipeOpLearnerCV. The default is in-sample resampling (resampling.method = "insample").
graph_learner: (logical(1))
If TRUE, wraps the Graph as a GraphLearner and returns it; otherwise (default) returns the Graph.
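
To make the default survregr_params more concrete, the following is a minimal base R sketch of inverse probability of censoring weighting (ipcw) with a Kaplan-Meier estimate of the censoring distribution. The toy data and the helper km_censoring are hypothetical and written only for illustration; the exact weighting used by PipeOpTaskSurvRegr, including the role of alpha, may differ.

```r
# Hypothetical toy data: status 1 = event observed, 0 = censored.
# The times contain no ties, which keeps the estimator below simple.
toy <- data.frame(
  time   = c(3, 5, 7, 8, 9, 11, 12, 15),
  status = c(1, 1, 1, 0, 0, 0, 1, 1)
)

# Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t),
# obtained by treating censoring (status == 0) as the event of interest.
# Illustrative helper, not part of any package; assumes untied times.
km_censoring <- function(time, status) {
  ord     <- order(time)
  time    <- time[ord]
  cens    <- 1 - status[ord]
  at_risk <- rev(seq_along(time))        # n, n - 1, ..., 1
  surv    <- cumprod(1 - cens / at_risk)
  stepfun(time, c(1, surv))              # right-continuous step function
}

G <- km_censoring(toy$time, toy$status)

# ipcw weights: censored rows get weight 0, observed events are
# up-weighted by the inverse probability of remaining uncensored.
w <- toy$status / G(toy$time)
```

The weights could then be passed to a weighted regression fit, for example via the weights argument of lm().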

Details

Three reduction strategies are implemented:

Method 1: Survival to Deterministic Regression A
1. PipeOpTaskSurvRegr converts TaskSurv to TaskRegr.
2. A LearnerRegr is fit and predicted on the new TaskRegr.
3. PipeOpPredRegrSurv transforms the resulting PredictionRegr to PredictionSurv.

Method 2: Survival to Probabilistic Regression
1. PipeOpTaskSurvRegr converts TaskSurv to TaskRegr.
2. A LearnerRegr is fit on the new TaskRegr to predict response; optionally, a second LearnerRegr can be fit to predict se.
3. PipeOpProbregr composes a distr prediction from the learner(s).
4. PipeOpPredRegrSurv transforms the resulting PredictionRegr to PredictionSurv.

Method 3: Survival to Deterministic Regression B
1. PipeOpLearnerCV cross-validates and makes predictions from a linear LearnerSurv with lp predict type on the original TaskSurv.
2. PipeOpTaskSurvRegr transforms the lp predictions into the target of a TaskRegr with the same features as the original TaskSurv.
3. A LearnerRegr is fit and predicted on the new TaskRegr.
4. PipeOpPredRegrSurv transforms the resulting PredictionRegr to PredictionSurv.
5. Optionally: PipeOpDistrCompositor is used to compose a distr predict_type from the predicted lp predict_type.
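
The flow of the first strategy can be sketched in base R without any mlr3 objects. This is a conceptual illustration only: the toy data are hypothetical, censoring is handled by simple deletion, and lm() stands in for an arbitrary LearnerRegr. It is not the implementation used by the PipeOps.

```r
# Hypothetical toy data: status 1 = event observed, 0 = censored.
toy <- data.frame(
  time   = c(5, 8, 12, 3, 9, 15, 7, 11),
  status = c(1, 0, 1, 1, 0, 1, 1, 0),
  x      = c(0.2, 1.1, 1.9, 0.1, 1.3, 2.4, 0.8, 1.6)
)

# Step 1 (analogue of the survival to regression task conversion):
# drop censored rows and keep the observed time as the regression target.
regr_data <- toy[toy$status == 1, c("time", "x")]

# Step 2: fit an ordinary regression model on the transformed data.
fit <- lm(time ~ x, data = regr_data)

# Step 3: predict survival times for every row. In the pipeline these
# regression predictions would be converted to survival predictions.
pred_time <- predict(fit, newdata = toy)
```

Deletion is the simplest way to remove censoring and discards information; the other transformation methods mentioned on this page (for example ipcw or mrl) keep more of the data.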

Interpretation:

Method 1: Once censoring has been removed from a dataset (by a given method), a regression learner can predict the survival time as the response.
Method 2: This reduction is very similar to the first method; the main difference is the distribution composition. In the first method the distribution is composed in a survival framework by assuming a linear model form and a baseline hazard estimator, whereas in the second method the composition is in a regression framework. The latter can result in problematic negative predictions and should therefore be interpreted with caution; however, a wider choice of distributions makes it a more flexible composition.
Method 3: This is a rarer use-case that bypasses censoring not by removing it but by first predicting the linear predictor from a survival model and then fitting a regression model on these predictions. The resulting regression predictions can then be viewed as the linear predictors of the new data, which can ultimately be composed to a distribution.
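
The caution about negative predictions in the probabilistic regression case can be illustrated with a small base R sketch. It assumes, purely for illustration, that a Normal distribution is composed from each response and se prediction; the prediction values are hypothetical and the distribution actually used depends on probregr_params.

```r
# Hypothetical response and standard-error predictions from a regression learner.
response <- c(10, 4, 2)
se       <- c(2, 3, 1.5)

# Probability mass that a Normal(response, se) distribution places on
# negative survival times, computed separately for each observation.
p_negative <- pnorm(0, mean = response, sd = se)
```

Observations whose se is large relative to the predicted response place a non-negligible share of probability on negative times, which has no meaning for a survival time.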

Examples

if (FALSE) { # \dontrun{
  library(mlr3)
  library(mlr3proba)
  library(mlr3pipelines)

  task = tsk("rats")

  # method 1 with censoring deletion, compose to distribution
  pipe = ppl(
    "survtoregr",
    method = 1,
    regr_learner = lrn("regr.featureless"),
    survregr_params = list(method = "delete")
  )
  pipe$train(task)
  pipe$predict(task)

  # method 2 with censoring imputation (mrl), one regr learner
  pipe = ppl(
    "survtoregr",
    method = 2,
    regr_learner = lrn("regr.featureless", predict_type = "se"),
    survregr_params = list(method = "mrl")
  )
  pipe$train(task)
  pipe$predict(task)

  # method 3 with censoring omission and no composition, insample resampling
  pipe = ppl(
    "survtoregr",
    method = 3,
    regr_learner = lrn("regr.featureless"),
    distrcompose = FALSE,
    surv_learner = lrn("surv.coxph"),
    survregr_params = list(method = "omission")
  )
  pipe$train(task)
  pipe$predict(task)
} # }