Estimates (or 'composes') a survival distribution from two PredictionSurvs: a predicted baseline distr from one, and a crank or lp from the other.

Compositor Assumptions:

  • The baseline distr is a discrete estimator, i.e. LearnerSurvKaplan or LearnerSurvNelson

  • The composed distr is of a linear form

  • If lp is missing, crank is used in its place and assumed to be equivalent

These assumptions are strong and may not be reasonable. Future updates will upgrade this compositor to be more flexible.


R6Class inheriting from PipeOp.


PipeOpDistrCompositor$new(id = "distrcompose", param_vals = list())
  • id :: character(1)
    Identifier of the resulting object, default "distrcompose".

  • param_vals :: named list
    List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().

Input and Output Channels

PipeOpDistrCompositor has two input channels, "base" and "pred". Both input channels take NULL during training and PredictionSurv during prediction.

PipeOpDistrCompositor has one output channel named "output", producing NULL during training and a PredictionSurv during prediction.

The output during prediction is the PredictionSurv from the "pred" input with an extra (or overwritten) column for the distr predict type, composed from the distr of "base" and the lp or crank of "pred".


The $state is left empty (list()).


The parameters are:

  • form :: character(1)
    Determines the form that the predicted linear survival model should take. One of accelerated failure time ("aft"), proportional hazards ("ph"), or proportional odds ("po"). Default "aft".

  • overwrite :: logical(1)
    If FALSE (default) and the "pred" input already has a distr, the compositor does nothing and returns the given PredictionSurv unchanged. If TRUE, the distr is overwritten with the distr composed from lp/crank; this is useful for changing the prediction distr from one model form to another.


The forms above correspond to the following survival distributions:

$$aft: S(t) = S_0\left(\frac{t}{\exp(lp)}\right)$$
$$ph: S(t) = S_0(t)^{\exp(lp)}$$
$$po: S(t) = \frac{S_0(t)}{\exp(-lp) + (1-\exp(-lp)) S_0(t)}$$

where \(S_0\) is the estimated baseline survival distribution and \(lp\) is the predicted linear predictor. If the input model does not predict a linear predictor, crank is used as the lp; this may be a strong and unreasonable assumption.
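The three formulas can be checked numerically with a small base-R sketch. It is only an illustration of the arithmetic, not the compositor's actual implementation: the baseline values are made up, and S0 and compose_surv are hypothetical names that do not exist in any package.

```r
# Hypothetical baseline survival S_0(t): a right-continuous step function
# built from made-up Kaplan-Meier-style estimates (illustrative values only).
event_times = c(1, 2, 3, 4)
baseline_surv = c(0.9, 0.7, 0.4, 0.2)
S0 = stepfun(event_times, c(1, baseline_surv))

# compose_surv is a hypothetical helper that applies the formulas above
# to a single linear predictor `lp` at time points `t`.
compose_surv = function(t, lp, form = c("aft", "ph", "po")) {
  form = match.arg(form)
  switch(form,
    aft = S0(t / exp(lp)),
    ph = S0(t)^exp(lp),
    po = S0(t) / (exp(-lp) + (1 - exp(-lp)) * S0(t))
  )
}

new_times = c(0.5, 1.5, 2.5, 3.5)

# With lp = 0 every form reduces to the baseline S_0(t).
compose_surv(new_times, lp = 0, form = "aft")

# A non-zero lp rescales time (aft), the hazard (ph) or the odds (po).
compose_surv(new_times, lp = 0.5, form = "aft")
compose_surv(new_times, lp = 0.5, form = "ph")
compose_surv(new_times, lp = 0.5, form = "po")
```

Setting lp = 0 is a convenient sanity check, since exp(0) = 1 makes all three expressions collapse to \(S_0(t)\).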


Only fields inherited from PipeOp.


Only methods inherited from PipeOp.



Examples

library(mlr3)
library(mlr3proba)
library(mlr3pipelines)
set.seed(42)

# Three methods to transform the cox ph predicted `distr` to an
# accelerated failure time model
task = tgen("simsurv")$generate(30)

# Method 1 - Train and predict separately then compose
base = lrn("surv.kaplan")$train(task)$predict(task)
pred = lrn("surv.coxph")$train(task)$predict(task)
pod = po("distrcompose", param_vals = list(form = "aft", overwrite = TRUE))
pod$predict(list(base = base, pred = pred))
#> $output
#> <PredictionSurv> for 30 observations:
#>  row_id      time status       crank                distr          lp
#>       1 3.3190202   TRUE -0.17372647 <VectorDistribution> -0.17372647
#>       2 0.1632683   TRUE -0.15401996 <VectorDistribution> -0.15401996
#>       3 3.0805302   TRUE  0.52964550 <VectorDistribution>  0.52964550
#> ---
#>      28 2.7789159   TRUE -0.17421424 <VectorDistribution> -0.17421424
#>      29 3.3710556   TRUE  0.06947420 <VectorDistribution>  0.06947420
#>      30 4.1414291   TRUE  0.01427538 <VectorDistribution>  0.01427538
#>

# Examples not run to save run-time.
if (FALSE) {
  # Method 2 - Create a graph manually
  gr = Graph$new()$
    add_pipeop(po("learner", lrn("surv.kaplan")))$
    add_pipeop(po("learner", lrn("surv.glmnet")))$
    add_pipeop(po("distrcompose"))$
    add_edge("surv.kaplan", "distrcompose", dst_channel = "base")$
    add_edge("surv.glmnet", "distrcompose", dst_channel = "pred")
  # Train the graph first, then predict in a separate call
  gr$train(task)
  gr$predict(task)

  # Method 3 - Syntactic sugar: Wrap the learner in a graph.
  cvglm.distr = distrcompositor(learner = lrn("surv.cvglmnet"),
    estimator = "kaplan", form = "aft")
  cvglm.distr$train(task)$predict(task)
}