Estimates (or 'composes') a survival distribution from two PredictionSurvs: one supplying a predicted baseline distr, the other supplying a crank or lp.

Compositor Assumptions:

• The baseline distr is a discrete estimator, i.e. LearnerSurvKaplan or LearnerSurvNelson

• The composed distr is of a linear form

• If lp is missing, crank is used in its place and treated as equivalent to lp

These assumptions are strong and may not be reasonable. Future updates will upgrade this compositor to be more flexible.

## Format

R6Class inheriting from PipeOp.

## Parameters

The parameters are:

• form :: character(1)
Determines the form that the predicted linear survival model should take. One of "aft" (accelerated failure time), "ph" (proportional hazards), or "po" (proportional odds). Default "aft".

• overwrite :: logical(1)
If FALSE (default) and the "pred" input already has a distr, the compositor does nothing and returns the given PredictionSurv unchanged. If TRUE, the distr is overwritten with the distr composed from lp/crank; this is useful for changing the predicted distr from one model form to another.

## Internals

The forms above correspond to the following survival distributions:

$$\text{aft: } S(t) = S_0\left(\frac{t}{\exp(lp)}\right)$$

$$\text{ph: } S(t) = S_0(t)^{\exp(lp)}$$

$$\text{po: } S(t) = \frac{S_0(t)}{\exp(-lp) + (1 - \exp(-lp)) S_0(t)}$$

where $$S_0$$ is the estimated baseline survival distribution and $$lp$$ is the predicted linear predictor. If the input model does not predict a linear predictor, crank is used as the lp; this may be a strong and unreasonable assumption.
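The three formulas can be checked by hand with a small base R sketch. It is an illustration only, not the compositor's actual implementation: the baseline times and survival probabilities are made up, and `S0`, `compose_surv` and `t_grid` are hypothetical names introduced here.

```r
# Hypothetical baseline: a discrete (step-function) survival estimate of the
# kind a Kaplan-Meier fit produces. The numbers are made up for illustration.
times = c(1, 2, 3, 4)
surv = c(0.9, 0.7, 0.4, 0.2)

# Right-continuous step function with S0(t) = 1 before the first event time.
S0 = stepfun(times, c(1, surv))

# Compose S(t) from the baseline and a linear predictor, following the
# aft, ph and po formulas given in this section.
compose_surv = function(t, lp, form = c("aft", "ph", "po")) {
  form = match.arg(form)
  switch(form,
    aft = S0(t / exp(lp)),
    ph = S0(t)^exp(lp),
    po = S0(t) / (exp(-lp) + (1 - exp(-lp)) * S0(t))
  )
}

# Evaluate each form on a small time grid for one observation with lp = 0.5.
t_grid = c(0.5, 1.5, 2.5, 3.5)
sapply(c(aft = "aft", ph = "ph", po = "po"),
  function(f) compose_surv(t_grid, lp = 0.5, form = f))
```

With lp = 0 all three forms reduce to the baseline, so `compose_surv(t_grid, 0, "ph")` equals `S0(t_grid)`; this makes a quick sanity check for the formulas.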

## Fields

Only fields inherited from PipeOp.

## Methods

Only methods inherited from PipeOp.

## Examples

library(mlr3)
library(mlr3proba)
library(mlr3pipelines)
set.seed(42)

# Three methods to transform the cox ph predicted distr to an
#  accelerated failure time model
task = tgen("simsurv")$generate(30)

# Method 1 - Train and predict separately then compose
base = lrn("surv.kaplan")$train(task)$predict(task)
pred = lrn("surv.coxph")$train(task)$predict(task)
pod = po("distrcompose", param_vals = list(form = "aft", overwrite = TRUE))
pod$predict(list(base = base, pred = pred))
#> $output
#> <PredictionSurv> for 30 observations:
#>     row_id      time status       crank                distr          lp
#>          1 3.3190202   TRUE -0.17372647 <VectorDistribution> -0.17372647
#>          2 0.1632683   TRUE -0.15401996 <VectorDistribution> -0.15401996
#>          3 3.0805302   TRUE  0.52964550 <VectorDistribution>  0.52964550
#> ---
#>         28 2.7789159   TRUE -0.17421424 <VectorDistribution> -0.17421424
#>         29 3.3710556   TRUE  0.06947420 <VectorDistribution>  0.06947420
#>         30 4.1414291   TRUE  0.01427538 <VectorDistribution>  0.01427538
#>

# Examples not run to save run-time.
if (FALSE) {
# Method 2 - Create a graph manually
gr = Graph$new()$
  add_pipeop(po("learner", lrn("surv.kaplan")))$
  add_pipeop(po("learner", lrn("surv.glmnet")))$
  add_pipeop(po("distrcompose"))$
  add_edge("surv.kaplan", "distrcompose", dst_channel = "base")$
  add_edge("surv.glmnet", "distrcompose", dst_channel = "pred")
gr$train(task)
gr$predict(task)

# Method 3 - Syntactic sugar: Wrap the learner in a graph.
cvglm.distr = distrcompositor(learner = lrn("surv.cvglmnet"),
  estimator = "kaplan",
  form = "aft")
# mlr3 learners are fitted with $train(); there is no $fit() method.
cvglm.distr$train(task)$predict(task)
}