This calibration method is defined by calculating $$s = \frac{B}{n} \sum_{i=1}^{B} \left(P_i - \frac{n}{B}\right)^2$$ where \(B\) is the number of 'buckets', \(n\) is the number of predictions, and \(P_i\) is the predicted number of deaths in the \(i\)th interval \([0, 1/B), [1/B, 2/B), \ldots, [(B - 1)/B, 1)\).

A model is well-calibrated if `s ~ Unif(B)`, tested with `chisq.test` (`p > 0.05` if well-calibrated).
Model `i` is better calibrated than model `j` if `s_i < s_j`.
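As a rough illustration of the statistic above, the following base R sketch computes \(s\) from per-bucket counts and compares two models. The helper `dcalib_stat()` and the bucket counts are hypothetical and made up for this example; they are not part of the package, and the sketch does not show how the package derives the counts from survival predictions.

```r
# Hypothetical helper (not part of mlr3proba): the statistic
# s = B/n * sum((P_i - n/B)^2) computed from per-bucket counts P_i.
dcalib_stat <- function(counts) {
  B <- length(counts)  # number of buckets
  n <- sum(counts)     # number of predictions
  (B / n) * sum((counts - n / B)^2)
}

# Made-up bucket counts for two models, each with n = 100 and B = 10.
counts_a <- c(10, 9, 11, 10, 10, 12, 8, 10, 9, 11)
counts_b <- c(25, 5, 5, 15, 10, 10, 5, 5, 10, 10)

s_a <- dcalib_stat(counts_a)
s_b <- dcalib_stat(counts_b)

# The same value is reported by a goodness-of-fit test against a
# uniform distribution over the buckets.
chisq_a <- unname(chisq.test(counts_a)$statistic)

# A smaller statistic indicates better calibration.
s_a < s_b
```

Under these assumed counts, model `a` has the smaller statistic and would be considered the better calibrated of the two.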

## Details

This measure can return either the test statistic or the p-value from the `chisq.test`.
The former is useful for model comparison, whereas the latter is useful for determining whether a model
is well-calibrated. If `chisq = FALSE` and `m` is the predicted value, then you can manually
compute the p-value with `pchisq(m, B - 1, lower.tail = FALSE)`.
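The manual conversion can be sketched in base R as follows. The values of `m_good` and `m_poor` are made-up statistics chosen for illustration, and `B = 10` is assumed to match the default number of buckets.

```r
# Assumed number of buckets (the documented default).
B <- 10

# Made-up test statistics, standing in for the measure's score
# when chisq = FALSE.
m_good <- 1.2
m_poor <- 35

# Upper-tail probability of a chi-squared distribution with B - 1
# degrees of freedom.
p_good <- pchisq(m_good, B - 1, lower.tail = FALSE)
p_poor <- pchisq(m_poor, B - 1, lower.tail = FALSE)

# p > 0.05 is read as well-calibrated.
c(good = p_good > 0.05, poor = p_poor > 0.05)
```

With these assumed values, only the first statistic passes the `p > 0.05` threshold.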

NOTE: This measure is still experimental both theoretically and in implementation. Results should therefore only be taken as an indicator of performance and not for conclusive judgements about model calibration.

## Dictionary

This Measure can be instantiated via the dictionary `mlr_measures` or with the associated sugar function `msr()`:

```
MeasureSurvDCalibration$new()
mlr_measures$get("surv.dcalib")
msr("surv.dcalib")
```

## References

Haider, Humza, Hoehn, Bret, Davis, Sarah, Greiner, Russell (2020).
“Effective Ways to Build and Evaluate Individual Survival Distributions.”
*Journal of Machine Learning Research*, **21**(85), 1--63.
https://jmlr.org/papers/v21/18-772.html.

## See also

Other survival measures:
`mlr_measures_surv.calib_alpha`,
`mlr_measures_surv.calib_beta`,
`mlr_measures_surv.chambless_auc`,
`mlr_measures_surv.cindex`,
`mlr_measures_surv.graf`,
`mlr_measures_surv.hung_auc`,
`mlr_measures_surv.intlogloss`,
`mlr_measures_surv.logloss`,
`mlr_measures_surv.mae`,
`mlr_measures_surv.mse`,
`mlr_measures_surv.nagelk_r2`,
`mlr_measures_surv.oquigley_r2`,
`mlr_measures_surv.rcll`,
`mlr_measures_surv.rmse`,
`mlr_measures_surv.schmid`,
`mlr_measures_surv.song_auc`,
`mlr_measures_surv.song_tnr`,
`mlr_measures_surv.song_tpr`,
`mlr_measures_surv.uno_auc`,
`mlr_measures_surv.uno_tnr`,
`mlr_measures_surv.uno_tpr`,
`mlr_measures_surv.xu_r2`

Other calibration survival measures:
`mlr_measures_surv.calib_alpha`,
`mlr_measures_surv.calib_beta`

Other distr survival measures:
`mlr_measures_surv.calib_alpha`,
`mlr_measures_surv.graf`,
`mlr_measures_surv.intlogloss`,
`mlr_measures_surv.logloss`,
`mlr_measures_surv.rcll`,
`mlr_measures_surv.schmid`

## Super classes

`mlr3::Measure` -> `mlr3proba::MeasureSurv` -> `MeasureSurvDCalibration`

## Methods

### Method `new()`

Creates a new instance of this R6 class.

#### Usage

`MeasureSurvDCalibration$new()`

#### Arguments

`B` (`integer(1)`)

Number of buckets to test for uniform predictions over. Default of `10` is recommended by Haider et al. (2020).

`chisq` (`logical(1)`)

If `TRUE`, returns the p-value of the corresponding `chisq.test` instead of the measure. Otherwise this can be performed manually with `pchisq(m, B - 1, lower.tail = FALSE)`. `p > 0.05` indicates a well-calibrated model.