Calculates the Integrated Calibration Index (ICI), which evaluates point calibration, i.e. calibration at a specific time point; see Austin et al. (2020).

Details

Each individual \(i\) from the test set has an observed survival outcome \((t_i, \delta_i)\) (time and censoring indicator) and a predicted survival function \(S_i(t)\). The predicted probability of an event occurring before a specific time point \(t_0\) is defined as \(\hat{P}_i(t_0) = F_i(t_0) = 1 - S_i(t_0)\).
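As a minimal sketch of this definition, the following base R snippet turns predicted survival curves into event probabilities at \(t_0\). The survival matrix, the time grid and the right-continuous step-function evaluation are illustrative assumptions, not the package's internal representation.

```r
# Hypothetical predicted survival curves: one row per observation,
# one column per time point on a shared grid
times <- c(100, 200, 300, 400)
surv <- rbind(
  c(0.90, 0.80, 0.60, 0.50),
  c(0.95, 0.90, 0.85, 0.70)
)

t0 <- 250

# Assumption: S_i(t) is a right-continuous step function, so S_i(t0) is the
# value at the last grid point <= t0 (and 1 before the first grid point)
idx <- findInterval(t0, times)
s_t0 <- if (idx == 0) rep(1, nrow(surv)) else surv[, idx]

# Predicted probability of an event before t0: F_i(t0) = 1 - S_i(t0)
p_hat <- 1 - s_t0
p_hat
```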

Using hazard regression (via the polspline R package), a smoothed calibration curve is estimated by fitting the following model: $$\log(h(t)) = g(\log(-\log(1 - \hat{P}_{t_0})), t)$$

Note that predicted probabilities \(\hat{P}_{t_0} = 0\) are replaced with a small number \(\epsilon\) to avoid arithmetic issues (\(\log(0)\)). Similarly, \(\hat{P}_{t_0} = 1\) is replaced with \(1 - \epsilon\). From this model, the smoothed probability of occurrence at \(t_0\) for observation \(i\) is obtained as \(\hat{P}_i^c(t_0)\).
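The boundary substitution and the complementary log-log transform can be sketched in base R as follows. The input vector is made up for illustration; `eps` is set to the documented default of 1e-04.

```r
eps <- 1e-4

# Hypothetical predicted event probabilities, including both boundary cases
p_hat <- c(0, 0.25, 1)

# Replace exact 0 with eps and exact 1 with 1 - eps
p_sub <- p_hat
p_sub[p_sub == 0] <- eps
p_sub[p_sub == 1] <- 1 - eps

# Complementary log-log transform used as the covariate in the hazard model
cll <- log(-log(1 - p_sub))

# Without the substitution, the boundary cases are not finite
log(-log(1 - p_hat))
cll
```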

The Integrated Calibration Index is then computed across the \(N\) test set observations as: $$ICI = \frac{1}{N} \sum_{i=1}^N | \hat{P}_i^c(t_0) - \hat{P}_i(t_0) |$$

Therefore, a perfect calibration (smoothed probabilities match predicted probabilities for all observations) yields \(ICI = 0\), while the worst possible score is \(ICI = 1\).
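The whole procedure can be sketched on simulated data with `polspline::hare()` and `polspline::phare()`. This is an illustration of the steps described above under stated assumptions (simulated exponential survival times, predictions taken from the true model, polspline installed); it is not necessarily identical to the code this measure runs internally.

```r
library(polspline)

set.seed(1)
n <- 300
x <- rnorm(n)

# Simulated survival data: exponential event and censoring times
rate <- 0.002 * exp(0.5 * x)
event_time <- rexp(n, rate = rate)
cens_time <- rexp(n, rate = 0.001)
time <- pmin(event_time, cens_time)
status <- as.integer(event_time <= cens_time)

# Hypothetical predictions at t0, here taken from the true model
t0 <- 365
p_hat <- 1 - exp(-rate * t0)

# Boundary substitution and complementary log-log transform
eps <- 1e-4
p_hat[p_hat == 0] <- eps
p_hat[p_hat == 1] <- 1 - eps
cll <- as.matrix(log(-log(1 - p_hat)))

# Hazard regression with the transformed predictions as the only covariate
fit <- hare(data = time, delta = status, cov = cll)

# Smoothed probability of an event before t0 for each observation
p_smooth <- phare(t0, cll, fit)

# Integrated Calibration Index: mean absolute difference
ici <- mean(abs(p_smooth - p_hat))
ici
```

Because the predictions come from the data-generating model, the resulting value should be small, though not exactly zero on a finite sample.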

Dictionary

This Measure can be instantiated via the dictionary mlr_measures or with the associated sugar function msr():

MeasureSurvICI$new()
mlr_measures$get("surv.calib_index")
msr("surv.calib_index")

Parameters

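| Id     | Type      | Default | Levels              | Range             |
|--------|-----------|---------|---------------------|-------------------|
| time   | numeric   | -       |                     | \([0, \infty)\)   |
| eps    | numeric   | 1e-04   |                     | \([0, 1]\)        |
| method | character | ICI     | ICI, E50, E90, Emax | -                 |
| na.rm  | logical   | TRUE    | TRUE, FALSE         | -                 |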
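As a short sketch of how these parameters might be set, the snippet below passes them to `msr()` at construction and then changes one afterwards through the measure's parameter set. It assumes mlr3 and mlr3proba are installed; the chosen values are arbitrary.

```r
library(mlr3)
library(mlr3proba)

# Evaluate calibration at one year, summarised by the maximum absolute difference
m = msr("surv.calib_index", time = 365, method = "Emax")

# Inspect the values that were set explicitly
m$param_set$values$time
m$param_set$values$method

# Parameters can also be changed after construction
m$param_set$values$eps = 1e-6
m$param_set$values$eps
```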

Meta Information

  • Type: "surv"

  • Range: \([0, 1]\)

  • Minimize: TRUE

  • Required prediction: distr
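The same meta information can be read from a constructed measure object through the standard `mlr3::Measure` fields. A minimal sketch, assuming mlr3 and mlr3proba are installed:

```r
library(mlr3)
library(mlr3proba)

m = msr("surv.calib_index")

m$task_type     # task type the measure applies to
m$range         # lower and upper bound of the score
m$minimize      # whether smaller values are better
m$predict_type  # prediction type the measure requires
```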

Parameter details

  • eps (numeric(1))
    Very small number to substitute zero values in order to prevent errors in e.g. log(0) and/or division-by-zero calculations. Default value is 1e-04.

  • time (numeric(1))
    The specific time point \(t_0\) at which calibration is evaluated. If NULL, the median observed time from the test set is used.

  • method (character(1))
    Specifies the summary statistic used to calculate the final calibration score.

    • "ICI" (default): Uses the mean of absolute differences \(| \hat{P}_i^c(t_0) - \hat{P}_i(t_0) |\) across all observations.

    • "E50": Uses the median of absolute differences instead of the mean.

    • "E90": Uses the 90th percentile of absolute differences, emphasizing higher deviations.

    • "Emax": Uses the maximum absolute difference, capturing the largest discrepancy between predicted and smoothed probabilities.

  • na.rm (logical(1))
    If TRUE (default), any NA/NaN values that arise in the smoothed probabilities \(\hat{P}_i^c(t_0)\) are removed. A warning is still issued in such cases.
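The four `method` options differ only in how the absolute differences are summarised. The base R sketch below illustrates this with a hypothetical helper, `summarise_calibration()`, and made-up differences; it is not the package's internal function, and the 90th percentile uses R's default quantile definition as an assumption.

```r
# Hypothetical helper: summarise absolute differences |P^c_i(t0) - P_i(t0)|
summarise_calibration <- function(abs_diff,
                                  method = c("ICI", "E50", "E90", "Emax"),
                                  na.rm = TRUE) {
  method <- match.arg(method)
  switch(method,
    ICI  = mean(abs_diff, na.rm = na.rm),
    E50  = stats::median(abs_diff, na.rm = na.rm),
    E90  = unname(stats::quantile(abs_diff, probs = 0.9, na.rm = na.rm)),
    Emax = max(abs_diff, na.rm = na.rm)
  )
}

# Made-up absolute differences for four observations, one of them missing
abs_diff <- c(0.02, 0.05, NA, 0.10, 0.40)

ici  <- summarise_calibration(abs_diff, "ICI")
e50  <- summarise_calibration(abs_diff, "E50")
e90  <- summarise_calibration(abs_diff, "E90")
emax <- summarise_calibration(abs_diff, "Emax")
c(ICI = ici, E50 = e50, E90 = e90, Emax = emax)
```

With a single large deviation among otherwise small ones, E90 and Emax react much more strongly than the mean or the median.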

References

Austin, P. C., Harrell, F. E., van Klaveren, D. (2020). “Graphical calibration curves and the integrated calibration index (ICI) for survival models.” Statistics in Medicine, 39(21), 2714. ISSN 10970258, doi:10.1002/SIM.8570, https://pmc.ncbi.nlm.nih.gov/articles/PMC7497089/.

Super classes

mlr3::Measure -> mlr3proba::MeasureSurv -> MeasureSurvICI

Methods

Method new()

Creates a new instance of this R6 class.

Usage

MeasureSurvICI$new()

Method clone()

The objects of this class are cloneable with this method.

Usage

MeasureSurvICI$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

library(mlr3)
library(mlr3proba) # provides the survival task, learner and measure used below

# Define a survival Task
task = tsk("lung")

# Create train and test set
part = partition(task)

# Train Cox learner on the train set
cox = lrn("surv.coxph")
cox$train(task, row_ids = part$train)

# Make predictions for the test set
p = cox$predict(task, row_ids = part$test)

# ICI at median test set time
p$score(msr("surv.calib_index"))
#> surv.calib_index 
#>        0.1717185 

# ICI at specific time point
p$score(msr("surv.calib_index", time = 365))
#> surv.calib_index 
#>        0.1974633 

# E50 at specific time point
p$score(msr("surv.calib_index", method = "E50", time = 365))
#> surv.calib_index 
#>        0.1912177
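To compare all four summary statistics on the same prediction, the calls above can be wrapped in a loop. This is a sketch assuming mlr3, mlr3proba and their dependencies are installed; the seed is an arbitrary choice that only makes the random split reproducible, and no output is shown because the values depend on that split.

```r
library(mlr3)
library(mlr3proba)

set.seed(42) # arbitrary seed, makes the random train/test split reproducible

task = tsk("lung")
part = partition(task)
cox = lrn("surv.coxph")
cox$train(task, row_ids = part$train)
p = cox$predict(task, row_ids = part$test)

# Score the same prediction with each summary statistic at one year
methods = c("ICI", "E50", "E90", "Emax")
scores = vapply(methods, function(m) {
  unname(p$score(msr("surv.calib_index", method = m, time = 365)))
}, numeric(1))
scores
```

By construction, the results are ordered as E50 <= E90 <= Emax, since they are the median, the 90th percentile and the maximum of the same absolute differences.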