Negative Log-Likelihood Survival Measure
Source: R/MeasureSurvLogloss.R
mlr_measures_surv.logloss.Rd
Calculates the cross-entropy, also known as the negative log-likelihood (NLL) or logarithmic (log) loss.
Details
The (observation-wise) negative log-likelihood is defined as the negative logarithm of the predicted probability density function \(f_i\), evaluated at the observation time \(t_i\) (event or censoring):
$$L_{NLL}(S_i,t_i) = -\log{[f_i(t_i)]}$$
This loss ignores the censoring status of an observation and treats all outcomes as events. It is also an improper scoring rule; see Sonabend et al. (2024). See section Interpolation for implementation details.
To get a single score across all \(N\) observations of the test set, we return the average of the observation-wise scores:
$$\sum_{i=1}^N L_{NLL}(S_i, t_i) / N$$
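As a small illustration of the two formulas above, the following sketch computes the observation-wise loss and its average in base R. The density values in `dens` are made up for the example and stand in for \(f_i(t_i)\); they do not come from a fitted model.

```r
# Hypothetical predicted densities f_i(t_i), one per test observation
dens = c(0.20, 0.05, 0.50)

# Observation-wise negative log-likelihood: -log(f_i(t_i))
nll = -log(dens)

# Single score: the average over all N observations
score = mean(nll)
```

A smaller predicted density at the observed time gives a larger loss, so lower scores are better.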
Dictionary
This Measure can be instantiated via the dictionary mlr_measures or with the associated sugar function msr():
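A minimal sketch of the construction routes, assuming the dictionary key is `surv.logloss` (inferred from the help topic name `mlr_measures_surv.logloss`) and that the mlr3 and mlr3proba packages are installed:

```r
library(mlr3)
library(mlr3proba)

# Direct construction from the R6 class
m1 = MeasureSurvLogloss$new()

# Retrieval from the measure dictionary
m2 = mlr_measures$get("surv.logloss")

# Sugar function; the assumed key is the same as above
m3 = msr("surv.logloss")
```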
Parameter details
eps (numeric(1))
Very small number to substitute near-zero values in order to prevent errors in e.g. log(0) and/or division-by-zero calculations. Default value is 1e-06.
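To see why such a guard is needed, the sketch below clamps hypothetical density values at `eps` before taking the logarithm. The clamping via `pmax()` is an assumption made for illustration; the text above does not state exactly how the package substitutes near-zero values.

```r
eps = 1e-06

# Hypothetical densities, including an exact zero and a near-zero value
dens = c(0.2, 0, 1e-12)

# Without a guard, -log(0) is Inf
unguarded = -log(dens)

# Illustrative guard (assumed): replace anything below eps by eps
dens_safe = pmax(dens, eps)
nll = -log(dens_safe)
```

With the guard in place every loss value is finite and bounded above by `-log(eps)`.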
ERV (logical(1))
If TRUE, then the Explained Residual Variation method is applied, which means the score is standardized against a Kaplan-Meier baseline. Default is FALSE.
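One common way to express such a standardization is one minus the ratio of the model's score to the baseline's score. The sketch below assumes that form; the helper `erv()` and the two score values are hypothetical and are not taken from the package.

```r
# Assumed form of the standardization: 1 - (model score / baseline score)
erv = function(score_model, score_baseline) {
  1 - score_model / score_baseline
}

# Hypothetical scores: model loss 1.8, Kaplan-Meier baseline loss 2.4
erv_value = erv(1.8, 2.4)
```

Under this assumed form, a positive value means the model has a lower loss than the baseline, and zero means it performs no better than the baseline.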
Interpolation
To evaluate scores involving subject-specific survival functions \(S_i(t)\), we perform linear interpolation on the discrete survival values provided in the prediction. Duplicate survival values are removed prior to interpolation to ensure strict monotonicity and non-negative density values. Therefore we are left with the distinct survival time points \(t_0 < \cdots < t_n\) and the corresponding survival values \(S(t_j)\).
Interpolation is performed using base R’s approx() with method = "linear" and rule = 2, ensuring:
Left extrapolation (for \(t < t_0\)) assumes \(S(0) = 1\) and uses the slope from \((0, 1)\) to \((t_0, S(t_0))\).
Right extrapolation (for \(t > t_n\)) uses the slope from the last interval \((t_{n-1}, S(t_{n-1}))\) to \((t_n, S(t_n))\), with results truncated at 0 to preserve non-negativity.
This ensures a continuous, piecewise-linear survival function \(S(t)\) that satisfies \(S(0) = 1\) and remains non-increasing and non-negative across the entire domain.
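The de-duplication and interpolation steps can be sketched in base R as follows. The grid `times` and the values `surv` are made up for the example, and only points inside the grid are queried, so the extrapolation behaviour described above is not reproduced here.

```r
# Hypothetical survival curve on a discrete grid (note the repeated value 0.6)
times = c(1, 2, 3, 4)
surv  = c(0.9, 0.6, 0.6, 0.2)

# Drop duplicated survival values so the remaining curve is strictly decreasing
keep  = !duplicated(surv)
times = times[keep]
surv  = surv[keep]

# Linear interpolation at two query times inside the grid
s_new = approx(x = times, y = surv, xout = c(1.5, 3),
               method = "linear", rule = 2)$y
```

After de-duplication the grid is 1, 2, 4, so the query at time 3 is interpolated between the points at 2 and 4 rather than sitting on a flat segment.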
The density at a time point \(t\) that falls between two grid points, \(t_j \le t < t_{j+1}\), is estimated as follows:
$$ f_i(t) = -\frac{S_i(t_{j+1}) - S_i(t_j)}{t_{j+1} - t_j} $$
This corresponds to the (negative) slope of \(S_i(t)\) between \(t_j\) and the next grid point \(t_{j+1}\).
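A minimal sketch of this slope-based density estimate, using a made-up grid and observed time. The use of `findInterval()` to locate the left grid point of the interval containing the observed time is an illustrative choice, not necessarily what the package does internally.

```r
# Hypothetical de-duplicated grid and survival values for one observation
times = c(1, 2, 4)
surv  = c(0.9, 0.6, 0.2)

# Hypothetical observed time, lying between the grid points 2 and 4
t_obs = 3

# Index of the left grid point of the interval that contains t_obs
j = findInterval(t_obs, times)

# Density estimate: negative slope of the survival curve on that interval
dens = -(surv[j + 1] - surv[j]) / (times[j + 1] - times[j])

# Observation-wise negative log-likelihood
nll = -log(dens)
```

Because the survival values are strictly decreasing after de-duplication, the slope is negative and the resulting density is positive, so the logarithm is well defined.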
References
Sonabend, Raphael, Zobolas, John, Kopper, Philipp, Burk, Lukas, Bender, Andreas (2024). “Examining properness in the external validation of survival models with squared and logarithmic losses.” https://arxiv.org/abs/2212.05260v3.
See also
Other survival measures:
mlr_measures_surv.calib_alpha, mlr_measures_surv.calib_beta, mlr_measures_surv.calib_index, mlr_measures_surv.chambless_auc, mlr_measures_surv.cindex, mlr_measures_surv.dcalib, mlr_measures_surv.graf, mlr_measures_surv.hung_auc, mlr_measures_surv.intlogloss, mlr_measures_surv.mae, mlr_measures_surv.mse, mlr_measures_surv.nagelk_r2, mlr_measures_surv.oquigley_r2, mlr_measures_surv.rcll, mlr_measures_surv.rmse, mlr_measures_surv.schmid, mlr_measures_surv.song_auc, mlr_measures_surv.song_tnr, mlr_measures_surv.song_tpr, mlr_measures_surv.uno_auc, mlr_measures_surv.uno_tnr, mlr_measures_surv.uno_tpr, mlr_measures_surv.xu_r2
Other Probabilistic survival measures:
mlr_measures_surv.graf, mlr_measures_surv.intlogloss, mlr_measures_surv.rcll, mlr_measures_surv.schmid
Other distr survival measures:
mlr_measures_surv.calib_alpha, mlr_measures_surv.calib_index, mlr_measures_surv.dcalib, mlr_measures_surv.graf, mlr_measures_surv.intlogloss, mlr_measures_surv.rcll, mlr_measures_surv.schmid
Super classes
mlr3::Measure -> mlr3proba::MeasureSurv -> MeasureSurvLogloss