Probabilistic Supervised Learning for mlr3.
mlr3proba is a machine learning toolkit for making probabilistic predictions within the mlr3 ecosystem. It currently supports survival analysis, density estimation, and probabilistic supervised regression tasks.
Key features of mlr3proba are
mlr3proba makes use of the distr6 probability distribution interface as its probabilistic predictive return type.
The current mlr3proba release focuses on survival analysis, and contains:
The vision of mlr3proba is to provide comprehensive machine learning functionality to the mlr3 ecosystem for continuous probabilistic return types.
The survival task and its features are considered to be in the maturing stage of their lifecycle, and major changes are unlikely.
The density and probabilistic supervised regression tasks are currently in the early stages of development. Task frameworks have been drawn up, but may not be stable; learners need to be interfaced, and contributions are very welcome (see issues).
Install the latest release from CRAN:
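A typical call, assuming the package is available on your configured CRAN mirror:

```r
# Standard CRAN installation; assumes mlr3proba is currently hosted on CRAN
install.packages("mlr3proba")
```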
Install the development version from GitHub:
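One common approach uses the remotes package. The repository path `mlr-org/mlr3proba` is an assumption here and should be checked against the project's GitHub page:

```r
# Requires the remotes package: install.packages("remotes")
# The repository path is assumed, not taken from the text above
remotes::install_github("mlr-org/mlr3proba")
```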
Core learners are implemented in mlr3proba, recommended common learners are implemented in mlr3learners, and many more are implemented in mlr3extralearners. Use the interactive search table to search for available learners and see the learner status page for their live status.
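As a rough sketch, the learners registered in the current session can also be listed from the mlr3 learner dictionary. The `"^surv"` pattern assumes that survival learners use a `surv.` prefix, as the measure IDs in this document do; learners from mlr3learners or mlr3extralearners only appear once those packages are loaded.

```r
library(mlr3)
library(mlr3proba)

# Keys of all registered learners whose ID starts with "surv"
surv_learner_keys = mlr_learners$keys("^surv")
print(surv_learner_keys)
```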
For density estimation, only the log-loss is currently implemented. For survival analysis, the following measures are implemented:
| ID | Measure | Package |
| --- | --- | --- |
| surv.calib_alpha | van Houwelingen’s Alpha Calibration | mlr3proba |
| surv.calib_beta | van Houwelingen’s Beta Calibration | mlr3proba |
| surv.chambless_auc | Chambless and Diao’s AUC | survAUC |
| surv.graf | Integrated Graf Score | mlr3proba |
| surv.hungAUC | Hung and Chiang’s AUC | survAUC |
| surv.intlogloss | Integrated Log Loss | mlr3proba |
| surv.oquigley_r2 | O’Quigley, Xu, and Stare’s R2 | survAUC |
| surv.song_auc | Song and Zhou’s AUC | survAUC |
| surv.song_tnr | Song and Zhou’s TNR | survAUC |
| surv.song_tpr | Song and Zhou’s TPR | survAUC |
| surv.xu_r2 | Xu and O’Quigley’s R2 | survAUC |
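A measure from this table might be used roughly as follows. This is a minimal sketch, not the package's documented workflow: it assumes the `rats` example task and the `surv.coxph` learner are available in the installed version, and the 80/20 holdout split is an illustrative choice.

```r
library(mlr3)
library(mlr3proba)

# Example survival task (assumed to ship with mlr3proba under the key "rats")
task = tsk("rats")

# Cox proportional hazards learner (assumed key "surv.coxph")
learner = lrn("surv.coxph")

# Simple holdout split over the task's row ids
set.seed(1)
train_ids = sample(task$row_ids, floor(0.8 * task$nrow))
test_ids = setdiff(task$row_ids, train_ids)

# Fit on the training rows, predict on the held-out rows
learner$train(task, row_ids = train_ids)
prediction = learner$predict(task, row_ids = test_ids)

# Score the prediction with the Integrated Graf Score
score = prediction$score(msr("surv.graf"))
print(score)
```

Any other measure ID can be passed to `msr()` in the same way, provided the package listed in the last column is installed.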
prob predict type to TaskRegr, and associated learners/measures
MeasureSurv to return measures at multiple time-points simultaneously
mlr3proba is a free and open source software project that encourages participation and feedback. If you have any issues, questions, suggestions or feedback, please do not hesitate to open an “issue” about it on the GitHub page!
In case of problems / bugs, it is often helpful if you provide a “minimum working example” that showcases the behaviour (but don’t worry about this if the bug is obvious).
Predecessors to this package are previous instances of survival modelling in mlr. The skpro package in the Python/scikit-learn ecosystem follows a similar interface for probabilistic supervised learning and is an architectural predecessor. Several packages, such as rjags and stan, allow probabilistic predictive modelling through a general interface that is specific to Bayesian models. For implementations of a few survival models and measures, the central package is survival. There does not appear to be a package that provides an architectural framework for distribution/density estimation; see this list for a review of density estimation packages in R.
Several people contributed to the building of mlr3proba. Firstly, thanks to Michel Lang for writing mlr3survival. Several learners and measures implemented in mlr3proba, as well as the prediction, task, and measure surv objects, were initially written in mlr3survival before being absorbed into mlr3proba. Secondly, thanks to Franz Kiraly for major contributions towards the design of the proba-specific parts of the package, including compositors and predict types, and for mathematical contributions towards the scoring rules implemented in the package. Finally, thanks to Bernd Bischl and the rest of the mlr core team for building mlr3 and for many conversations about the design of