Calculates the **Integrated Survival Brier Score** (ISBS), Integrated Graf Score
or squared survival loss.

## Details

For an individual who dies at time \(t\), with predicted survival function \(S\), the Graf score at time \(t^*\) is given by
$$L_{ISBS}(S,t|t^*) = [(S(t^*)^2)I(t \le t^*, \delta = 1)(1/G(t))] + [((1 - S(t^*))^2)I(t > t^*)(1/G(t^*))]$$
where \(G\) is the Kaplan-Meier estimate of the censoring distribution.

The re-weighted ISBS (RISBS) is given by
$$L_{RISBS}(S,t|t^*) = [(S(t^*)^2)I(t \le t^*, \delta = 1)(1/G(t))] + [((1 - S(t^*))^2)I(t > t^*)(1/G(t))]$$
where \(G\) is the Kaplan-Meier estimate of the censoring distribution, i.e. both terms are always
weighted by \(1/G(t)\).
RISBS is strictly proper when the censoring distribution is independent
of the survival distribution and when \(G\) is fit on a sufficiently large dataset.
ISBS is never proper. Use `proper = FALSE` for ISBS and `proper = TRUE` for RISBS.
Results may be very different if many observations are
censored at the last observed time, because with `proper = TRUE` a zero value of \(G\) at that time
is replaced by `eps`, so the loss is divided by `eps`.
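As a rough illustration of the two weightings, the following base R sketch evaluates the loss for a single observation at a single time point. The helper `squared_survival_loss()` and the toy censoring function `G` are hypothetical names introduced for this example and are not part of the package; replacing a zero weight by `eps` is an assumption based on the description of `eps` on this page.

```r
# Hypothetical helper (not part of mlr3proba): loss for one observation at t_star.
# s_tstar: predicted S(t*), time: observed time, status: 1 = event, 0 = censored,
# G: function returning the censoring survival estimate at a given time.
squared_survival_loss <- function(s_tstar, time, status, t_star, G,
                                  proper = FALSE, eps = 1e-3) {
  # Assumed handling of zeros: replace G = 0 by eps before dividing
  weight <- function(t) {
    g <- G(t)
    if (g == 0) eps else g
  }
  if (time <= t_star && status == 1) {
    # Event observed by t*: penalise predicted survival, weight by 1 / G(t)
    s_tstar^2 / weight(time)
  } else if (time > t_star) {
    # Still at risk at t*: ISBS weights by 1 / G(t*), RISBS by 1 / G(t)
    (1 - s_tstar)^2 / weight(if (proper) time else t_star)
  } else {
    # Censored before t*: neither indicator is 1, so the contribution is zero
    0
  }
}

# Toy censoring distribution, purely illustrative
G <- function(t) max(0, 1 - 0.05 * t)

isbs_late  <- squared_survival_loss(0.6, time = 12, status = 1, t_star = 10, G = G)
risbs_late <- squared_survival_loss(0.6, time = 12, status = 1, t_star = 10, G = G,
                                    proper = TRUE)
early      <- squared_survival_loss(0.6, time = 5, status = 1, t_star = 10, G = G)
```

The two "late" values differ only in the weight applied to the second term, while the "early" value is the same under both settings.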

**Note**: If comparing the integrated Graf score to other packages, e.g.
pec, then `method = 2` should be used. However, the results may
still be very slightly different, as this package uses `survfit` to estimate
the censoring distribution, in line with the Graf 1999 paper, whereas some
other packages use `prodlim` with `reverse = TRUE` (meaning Kaplan-Meier is
not used).

If `task` and `train_set` are passed to `$score` then \(G(t)\) is fit on the training data,
otherwise on the testing data. The former is likely to reduce any bias caused by calculating
parts of the measure on the test data it is evaluating. The training data is automatically
used when scoring resamplings.
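A minimal sketch of how a Kaplan-Meier estimate of the censoring distribution can be obtained with `survfit` from the survival package, by treating censoring as the event. The toy data frame `train` and the name `G_hat` are made up for illustration; this is not the package's internal code.

```r
library(survival)

# Toy training data: status 1 = event, 0 = censored
train <- data.frame(
  time   = c(2, 4, 4, 7, 9, 12),
  status = c(1, 0, 1, 0, 1, 0)
)

# Flip the status indicator so that censoring becomes the "event" of interest
fit <- survfit(Surv(time, 1 - status) ~ 1, data = train)

# Right-continuous step function for G(t), equal to 1 before the first time
G_hat <- stepfun(fit$time, c(1, fit$surv))

# Evaluate the training-set estimate at arbitrary (e.g. test set) times
G_hat(c(1, 5, 10, 12))
```

Because the last observation in the toy data is censored, the estimate reaches zero at the last observed time.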

If `t_max` or `p_max` is given, then \(G(t)\) will be fitted using **all observations** from the
train set (or test set) and only then will the cutoff time be applied.
This is to ensure that more data is used for fitting the censoring distribution via the
Kaplan-Meier.
Setting `t_max` can help alleviate inflation of the score when `proper` is `TRUE`,
in cases where an observation is censored at the last observed time point.
When `t_max` is `NULL`, such an observation results in \(G(t_{max}) = 0\), so `eps` is used instead.
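To see why this matters, the following sketch compares one observation's contribution when the weight has been replaced by `eps` with the same squared residual under a moderate weight. All values are illustrative, and the zero-replacement rule is an assumption based on the description of `eps` on this page.

```r
eps <- 0.001          # default value of `eps` listed on this page
G_at_time <- 0        # illustrative: censoring estimate has dropped to 0
s_pred <- 0.2         # illustrative predicted survival probability at t*

# Assumed zero handling: substitute eps for a zero weight
weight <- if (G_at_time == 0) eps else G_at_time

inflated_loss <- (1 - s_pred)^2 / weight  # squared residual divided by eps
typical_loss  <- (1 - s_pred)^2 / 0.5     # same residual with G = 0.5
```

With these numbers a single observation contributes several hundred times more than it would under a moderate weight, which can dominate the mean over observations.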

## Dictionary

This Measure can be instantiated via the dictionary `mlr_measures` or with the associated sugar function `msr()`.
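A short sketch of the three construction routes, assuming the mlr3 and mlr3proba packages are installed. The dictionary key `"surv.graf"` is inferred from the class name `MeasureSurvGraf` and is not stated on this page, so treat it as an assumption.

```r
library(mlr3)
library(mlr3proba)

# Three ways to construct the measure; the key "surv.graf" is assumed
m1 <- MeasureSurvGraf$new()
m2 <- mlr_measures$get("surv.graf")
m3 <- msr("surv.graf")

# Parameters can be set at construction, e.g. the re-weighted score (RISBS)
risbs <- msr("surv.graf", proper = TRUE)
```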

## Parameters

| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| integrated | logical | TRUE | TRUE, FALSE | - |
| times | untyped | - | | - |
| t_max | numeric | - | | \([0, \infty)\) |
| p_max | numeric | - | | \([0, 1]\) |
| method | integer | 2 | | \([1, 2]\) |
| se | logical | FALSE | TRUE, FALSE | - |
| proper | logical | FALSE | TRUE, FALSE | - |
| eps | numeric | 0.001 | | \([0, 1]\) |
| ERV | logical | FALSE | TRUE, FALSE | - |

## Parameter details

`integrated` (`logical(1)`)

If `TRUE` (default), returns the integrated score (e.g. across time points); otherwise returns the non-integrated score (e.g. at a single time point).

`times` (`numeric()`)

If `integrated == TRUE` then a vector of time points over which to integrate the score. If `integrated == FALSE` then a single time point at which to return the score.

`t_max` (`numeric(1)`)

Cutoff time (i.e. time horizon) to evaluate the measure up to. Mutually exclusive with `p_max` or `times`. This will effectively remove test observations for which the time (event or censoring) is greater than `t_max`. It is recommended to set `t_max` to avoid division by `eps`, see Details.

`p_max` (`numeric(1)`)

The proportion of censoring to integrate up to in the given dataset. Mutually exclusive with `times` or `t_max`.

`method` (`integer(1)`)

If `integrated == TRUE`, this selects the integration weighting method. `method == 1` corresponds to weighting each time point equally and taking the mean score over discrete time points. `method == 2` corresponds to calculating a mean weighted by the difference between time points. `method == 2` is the default value, to be in line with other packages.
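The difference between the two weightings can be sketched in base R. The helper `integrate_score()` is a hypothetical name, and the exact form used for `method = 2` here (each score weighted by the width of the interval that follows it, normalised by the total time span) is one plausible reading of the description above rather than a confirmed copy of the package's code.

```r
# Hypothetical helper: combine per-time-point scores into one number.
# scores: score at each time point, times: sorted, strictly increasing time points
integrate_score <- function(scores, times, method = 2L) {
  stopifnot(length(scores) == length(times), length(times) >= 2)
  if (method == 1L) {
    # Equal weight for every time point
    return(mean(scores))
  }
  # Weight each score by the gap to the next time point (assumed form)
  widths <- diff(times)
  sum(widths * head(scores, -1)) / (max(times) - min(times))
}

times  <- c(1, 2, 4, 8)
scores <- c(0.10, 0.20, 0.30, 0.40)

equal_weighted <- integrate_score(scores, times, method = 1L)
gap_weighted   <- integrate_score(scores, times, method = 2L)
```

On an evenly spaced grid the two methods give similar results; they diverge when the time points are unevenly spaced, as in this toy example.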

`se` (`logical(1)`)

If `TRUE` then returns the standard error of the measure; otherwise returns the mean across all individual scores, i.e. the mean of the per-observation scores. Default is `FALSE` (returns the mean).
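As a small illustration of the two return values, assuming the standard error is the usual standard error of the mean of the per-observation scores (an assumption; the page does not spell out the formula). The vector `obs_scores` is made up.

```r
# Illustrative per-observation scores
obs_scores <- c(0.10, 0.25, 0.05, 0.40)

# se = FALSE: mean of the per-observation scores
mean_score <- mean(obs_scores)

# se = TRUE (assumed form): standard deviation divided by sqrt(n)
se_score <- sd(obs_scores) / sqrt(length(obs_scores))
```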

`proper` (`logical(1)`)

If `TRUE` then weights scores by the censoring distribution at the observed event time, which results in a strictly proper scoring rule if the censoring and survival time distributions are independent and a sufficiently large dataset is used. If `FALSE` then weights scores by the Graf method, which is the more common usage but does not give a proper loss.

`eps` (`numeric(1)`)

Very small number to substitute zero values in order to prevent errors in e.g. log(0) and/or division-by-zero calculations. Default value is 0.001.

`ERV` (`logical(1)`)

If `TRUE` then the Explained Residual Variation method is applied, which means the score is standardized against a Kaplan-Meier baseline. Default is `FALSE`.
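A minimal sketch of the standardization, assuming the common definition of explained residual variation as one minus the ratio of the model's score to the baseline's score. The helper `erv()` and the input values are hypothetical.

```r
# Hypothetical helper: standardize a model score against a baseline score.
# Values above 0 mean the model improves on the baseline; 1 is a perfect score.
erv <- function(model_score, baseline_score) {
  stopifnot(baseline_score > 0)
  1 - model_score / baseline_score
}

# Illustrative scores for a model and a Kaplan-Meier baseline
erv_value <- erv(model_score = 0.15, baseline_score = 0.20)
```

Under this definition a model that scores the same as the baseline gets 0, and a model that scores worse gets a negative value.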

## References

Graf E, Schmoor C, Sauerbrei W, Schumacher M (1999).
“Assessment and comparison of prognostic classification schemes for survival data.”
*Statistics in Medicine*, **18**(17-18), 2529–2545.
doi:10.1002/(sici)1097-0258(19990915/30)18:17/18<2529::aid-sim274>3.0.co;2-5.

## See also

Other survival measures:
`mlr_measures_surv.calib_alpha`, `mlr_measures_surv.calib_beta`, `mlr_measures_surv.chambless_auc`, `mlr_measures_surv.cindex`, `mlr_measures_surv.dcalib`, `mlr_measures_surv.hung_auc`, `mlr_measures_surv.intlogloss`, `mlr_measures_surv.logloss`, `mlr_measures_surv.mae`, `mlr_measures_surv.mse`, `mlr_measures_surv.nagelk_r2`, `mlr_measures_surv.oquigley_r2`, `mlr_measures_surv.rcll`, `mlr_measures_surv.rmse`, `mlr_measures_surv.schmid`, `mlr_measures_surv.song_auc`, `mlr_measures_surv.song_tnr`, `mlr_measures_surv.song_tpr`, `mlr_measures_surv.uno_auc`, `mlr_measures_surv.uno_tnr`, `mlr_measures_surv.uno_tpr`, `mlr_measures_surv.xu_r2`

Other Probabilistic survival measures:
`mlr_measures_surv.intlogloss`, `mlr_measures_surv.logloss`, `mlr_measures_surv.rcll`, `mlr_measures_surv.schmid`

Other distr survival measures:
`mlr_measures_surv.calib_alpha`, `mlr_measures_surv.dcalib`, `mlr_measures_surv.intlogloss`, `mlr_measures_surv.logloss`, `mlr_measures_surv.rcll`, `mlr_measures_surv.schmid`

## Super classes

`mlr3::Measure` -> `mlr3proba::MeasureSurv` -> `MeasureSurvGraf`