Negative Log-Likelihood Survival Measure
Source: R/MeasureSurvLogloss.R
mlr_measures_surv.logloss.Rd
Calculates the cross-entropy loss, also known as the negative log-likelihood (NLL) or logarithmic (log) loss.
Details
The Log Loss, in the context of probabilistic predictions, is defined as the negative logarithm of the predicted probability density function, \(f\), evaluated at the observation time (event or censoring), \(t\), $$L_{NLL}(f, t) = -\log[f(t)]$$
The standard error of the Log Loss, \(L\), is approximated via $$se(L) = sd(L)/\sqrt{N}$$ where \(N\) is the number of observations in the test set and \(sd\) is the standard deviation of the per-observation losses.
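As a rough illustration of the two formulas above, the following base-R sketch scores a hypothetical exponential model. The rate, the observation times and the way `eps` is applied are illustrative assumptions, not the package's internal code.

```r
# Hypothetical observed times (event or censoring) for four test observations
times = c(0.5, 1.2, 2.0, 3.5)

# Assumed predicted density: exponential with rate 0.5 for every observation
rate = 0.5
f_t = dexp(times, rate = rate)

# Guard against log(0), in the spirit of the eps parameter
eps = 1e-15
f_t = pmax(f_t, eps)

# Per-observation loss L_NLL(f, t) = -log f(t)
nll = -log(f_t)

# Mean score and its approximate standard error sd(L) / sqrt(N)
nll_mean = mean(nll)
nll_se = sd(nll) / sqrt(length(nll))
```

For an exponential density the loss reduces to `-log(rate) + rate * t`, which gives a quick sanity check on the per-observation values.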
The Re-weighted Negative Log-Likelihood (RNLL) or IPCW (Inverse Probability Censoring Weighted) Log Loss is defined by
$$L_{RNLL}(f, t, \delta) = - \frac{\delta \log[f(t)]}{G(t)}$$
where \(\delta\) is the censoring indicator (\(\delta = 1\) if the event was observed) and \(G(t)\) is the Kaplan-Meier estimator of the censoring distribution.
Only observations that have experienced the event (i.e. \(\delta = 1\)) are therefore taken into account for RNLL, and both \(f(t)\) and \(G(t)\) are evaluated only at the event times.
If only censored observations exist in the test set, NaN is returned.
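The re-weighting can be sketched in base R as follows. The data, the exponential density and the helper `km_censoring()` are hypothetical, and the sketch assumes the score is averaged over event observations only, which is consistent with NaN being returned when no events exist.

```r
# Hypothetical test data: observed times and status (1 = event, 0 = censored)
times  = c(1, 2, 3, 4, 5)
status = c(1, 0, 1, 0, 1)

# Kaplan-Meier estimate of the censoring survival function G(t):
# censored observations (status == 0) play the role of "events"
km_censoring = function(times, status) {
  cens_times = sort(unique(times[status == 0]))
  factors = vapply(cens_times, function(u) {
    at_risk = sum(times >= u)
    n_cens = sum(times == u & status == 0)
    1 - n_cens / at_risk
  }, numeric(1))
  surv = cumprod(factors)
  # Right-continuous step function, G(t) = 1 before the first censoring time
  function(t) vapply(t, function(x) {
    idx = sum(cens_times <= x)
    if (idx == 0) 1 else surv[idx]
  }, numeric(1))
}
G = km_censoring(times, status)

# Assumed predicted density: exponential with rate 0.5
eps = 1e-15
f_t = pmax(dexp(times, rate = 0.5), eps)
G_t = pmax(G(times), eps)

# L_RNLL = -log f(t) / G(t), evaluated for event observations only
is_event = status == 1
rnll = -log(f_t[is_event]) / G_t[is_event]
rnll_mean = mean(rnll)

# With no events the vector is empty and its mean is NaN
mean(numeric(0))
```

Later event times receive larger weights because \(G(t)\) shrinks as more observations are censored.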
Dictionary
This Measure can be instantiated via the dictionary mlr_measures or with the associated sugar function msr():
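The calls themselves are not shown in this copy of the page. Following the usual mlr3 conventions they would look roughly like the sketch below, where the key `"surv.logloss"` is inferred from the Rd file name and the block assumes that mlr3 and mlr3proba are installed.

```r
if (requireNamespace("mlr3proba", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)

  # Direct construction from the R6 class
  m1 = MeasureSurvLogloss$new()

  # Retrieval from the measure dictionary
  m2 = mlr_measures$get("surv.logloss")

  # Sugar function
  m3 = msr("surv.logloss")
}
```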
Parameters
Id | Type | Default | Levels | Range |
eps | numeric | 1e-15 | - | \([0, 1]\) |
se | logical | FALSE | TRUE, FALSE | - |
IPCW | logical | TRUE | TRUE, FALSE | - |
ERV | logical | FALSE | TRUE, FALSE | - |
Parameter details
eps
(numeric(1))
Very small number to substitute zero values in order to prevent errors in e.g. log(0) and/or division-by-zero calculations. Default value is 1e-15.
se
(logical(1))
If TRUE, returns the standard error of the measure; otherwise returns the mean of the per-observation scores. Default is FALSE (returns the mean).
ERV
(logical(1))
If TRUE, the Explained Residual Variation method is applied, which means the score is standardized against a Kaplan-Meier baseline. Default is FALSE.
IPCW
(logical(1))
If TRUE (default), returns the \(L_{RNLL}\) score (which is proper); otherwise returns the \(L_{NLL}\) score (improper). See Sonabend et al. (2024) for more details.
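A minimal sketch of how these parameters might be set through `msr()`, assuming mlr3 and mlr3proba are installed and that the measure key is `"surv.logloss"`. The variable names are illustrative.

```r
if (requireNamespace("mlr3proba", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)

  # Improper NLL variant, reported as a standard error instead of a mean
  m_se = msr("surv.logloss", IPCW = FALSE, se = TRUE)

  # Default RNLL variant, standardized against a Kaplan-Meier baseline
  m_erv = msr("surv.logloss", ERV = TRUE)

  # Inspect the values that were set
  print(m_se$param_set$values)
}
```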
Data used for Estimating Censoring Distribution
If task and train_set are passed to $score, then \(G(t)\) is fit using all observations from the train set; otherwise the test set is used.
Using the train set is likely to reduce any bias caused by calculating parts of the measure on the test data it is evaluating.
It also usually means that more data is used for fitting the censoring distribution \(G(t)\) via the Kaplan-Meier estimator.
The training data is automatically used in scoring resamplings.
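The difference between the two modes can be sketched as follows. The task (`"rats"`), the learner (`"surv.kaplan"`), the 70/30 split and the variable names are illustrative choices, and the block assumes mlr3 and mlr3proba are installed.

```r
if (requireNamespace("mlr3proba", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)

  set.seed(1)
  task = tsk("rats")
  train_ids = sample(task$row_ids, size = floor(0.7 * task$nrow))
  test_ids = setdiff(task$row_ids, train_ids)

  learner = lrn("surv.kaplan")
  learner$train(task, row_ids = train_ids)
  pred = learner$predict(task, row_ids = test_ids)

  measure = msr("surv.logloss")

  # G(t) estimated on the test set (no task / train_set supplied)
  score_test = pred$score(measure)

  # G(t) estimated on the train set
  score_train = pred$score(measure, task = task, train_set = train_ids)
}
```

The two scores generally differ because the weights \(1/G(t)\) come from different censoring estimates.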
References
Sonabend, Raphael, Zobolas, John, Kopper, Philipp, Burk, Lukas, Bender, Andreas (2024). “Examining properness in the external validation of survival models with squared and logarithmic losses.” https://arxiv.org/abs/2212.05260v2.
See also
Other survival measures:
mlr_measures_surv.calib_alpha, mlr_measures_surv.calib_beta, mlr_measures_surv.chambless_auc, mlr_measures_surv.cindex, mlr_measures_surv.dcalib, mlr_measures_surv.graf, mlr_measures_surv.hung_auc, mlr_measures_surv.intlogloss, mlr_measures_surv.mae, mlr_measures_surv.mse, mlr_measures_surv.nagelk_r2, mlr_measures_surv.oquigley_r2, mlr_measures_surv.rcll, mlr_measures_surv.rmse, mlr_measures_surv.schmid, mlr_measures_surv.song_auc, mlr_measures_surv.song_tnr, mlr_measures_surv.song_tpr, mlr_measures_surv.uno_auc, mlr_measures_surv.uno_tnr, mlr_measures_surv.uno_tpr, mlr_measures_surv.xu_r2
Other Probabilistic survival measures:
mlr_measures_surv.graf, mlr_measures_surv.intlogloss, mlr_measures_surv.rcll, mlr_measures_surv.schmid
Other distr survival measures:
mlr_measures_surv.calib_alpha, mlr_measures_surv.dcalib, mlr_measures_surv.graf, mlr_measures_surv.intlogloss, mlr_measures_surv.rcll, mlr_measures_surv.schmid
Super classes
mlr3::Measure -> mlr3proba::MeasureSurv -> MeasureSurvLogloss