Estimates (or 'composes') a survival distribution from a predicted baseline
survival distribution (distr) and a linear predictor (lp) from two PredictionSurvs.
Compositor Assumptions:
The baseline
distris a discrete estimator, e.g. surv.kaplan.The composed
distris of a linear form
Dictionary
This PipeOp can be instantiated via the
dictionary mlr3pipelines::mlr_pipeops or with the associated sugar
function mlr3pipelines::po():
Input and Output Channels
PipeOpDistrCompositor has two input channels, "base" and "pred".
Both input channels take NULL during training and PredictionSurv during prediction.
PipeOpDistrCompositor has one output channel named "output", producing
NULL during training and a PredictionSurv during prediction.
The output during prediction is the PredictionSurv from the "pred" input
but with an extra (or overwritten) column for the distr predict type; which
is composed from the distr of "base" and the lp of "pred".
If no lp predictions have been made or exist, then the "pred" is returned unchanged.
State
The $state is left empty (list()).
Parameters
The parameters are:
form::character(1)
Determines the form that the predicted linear survival model should take. This is either, accelerated-failure time,aft, proportional hazards,ph, or proportional odds,po. Defaultaft.overwrite::logical(1)
IfFALSE(default) then if the "pred" input already has adistr, the compositor does nothing and returns the given PredictionSurv. IfTRUE, then thedistris overwritten with thedistrcomposed fromlp- this is useful for changing the predictiondistrfrom one model form to another.scale_lp::logical(1)
This option is only applicable toformequal to"aft". IfTRUE, it min-max scales the linear prediction scores to be in the interval \([0,1]\), avoiding extrapolation of the baseline \(S_0(t)\) on the transformed time points \(\frac{t}{\exp(lp)}\), as these will be \(\in [\frac{t}{e}, t]\), and so always smaller than the maximum time point for which we have estimated \(S_0(t)\). Note that this is just a heuristic to get reasonable results in the case you observe survival predictions to be e.g. constant after the AFT composition and it definitely provides no guarantee for creating calibrated distribution predictions (as none of these methods do). Therefore, it is set toFALSEby default.
Internals
The respective forms above have respective survival distributions:
$$aft: S(t) = S_0(\frac{t}{\exp(lp)})$$
$$ph: S(t) = S_0(t)^{\exp(lp)}$$
$$po: S(t) = \frac{S_0(t)}{\exp(-lp) + (1-\exp(-lp)) S_0(t)}$$
where \(S_0\) is the estimated baseline survival distribution, and \(lp\) is the
predicted linear predictor.
For an example use of the "aft" composition using Kaplan-Meier as a baseline
distribution, see Norman et al. (2024).
References
Norman, A P, Li, Wanlu, Jiang, Wenyu, Chen, E B (2024). “deepAFT: A nonlinear accelerated failure time model with artificial neural network.” Statistics in Medicine. doi:10.1002/sim.10152 .
See also
Other survival compositors:
mlr_pipeops_compose_breslow_distr,
mlr_pipeops_crankcompose,
mlr_pipeops_responsecompose
Super class
mlr3pipelines::PipeOp -> PipeOpDistrCompositor
Methods
Method new()
Creates a new instance of this R6 class.
Usage
PipeOpDistrCompositor$new(id = "distrcompose", param_vals = list())Arguments
id(
character(1))
Identifier of the resulting object.param_vals(
list())
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction.
Examples
if (FALSE) { # \dontrun{
library(mlr3)
library(mlr3pipelines)
task = tsk("rats")
base = lrn("surv.kaplan")$train(task)$predict(task)
pred = lrn("surv.coxph")$train(task)$predict(task)
# let's change the distribution prediction of Cox (Breslow-based) to an AFT form:
pod = po("distrcompose", param_vals = list(form = "aft", overwrite = TRUE))
pod$train(list(NULL, NULL)) # need to train first, even if nothing happens
pod$predict(list(base = base, pred = pred))[[1]]
} # }