Publications
Publications sorted by year, starting with the most recent.
2026
- [arXiv] Quantifying Epistemic Predictive Uncertainty in Conformal Prediction. Siu Lun Chau, Soroush H. Zargarbashi, Yusuf Sale, and 1 more author. arXiv preprint arXiv:2602.016679, 2026.
We study the problem of quantifying epistemic predictive uncertainty (EPU) – that is, uncertainty faced at prediction time due to the existence of multiple plausible predictive models – within the framework of conformal prediction (CP). To expose the implicit model multiplicity underlying CP, we build on recent results showing that, under a mild assumption, any full CP procedure induces a set of closed and convex predictive distributions, commonly referred to as a credal set. Importantly, the conformal prediction region (CPR) coincides exactly with the set of labels to which all distributions in the induced credal set assign probability at least . As our first contribution, we prove that this characterisation also holds in split CP. Building on this connection, we then propose a computationally efficient and analytically tractable uncertainty measure, based on Maximum Mean Imprecision, to quantify the EPU by measuring the degree of conflicting information within the induced credal set. Experiments on active learning and selective classification demonstrate that the quantified EPU provides substantially more informative and fine-grained uncertainty assessments than reliance on CPR size alone. More broadly, this work highlights the potential of CP serving as a principled basis for decision-making under epistemic uncertainty.
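As a concrete illustration of the prediction-set mechanism this line of work builds on, here is a minimal split conformal prediction sketch on synthetic data; the toy model, names, and data are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
N_CLASSES = 3

def predict_proba(x):
    # Toy probabilistic classifier: softmax over negative squared
    # distances to the class "centers" 0, 1, 2.
    centers = np.arange(N_CLASSES)
    logits = -(x[:, None] - centers[None, :]) ** 2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def split_cp_sets(x_cal, y_cal, x_test, alpha=0.1):
    # Nonconformity score: 1 - predicted probability of the true label.
    p_cal = predict_proba(x_cal)
    scores = 1.0 - p_cal[np.arange(len(y_cal)), y_cal]
    n = len(scores)
    # Conformal quantile with the finite-sample (n + 1) correction.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q_hat = np.quantile(scores, level, method="higher")
    p_test = predict_proba(x_test)
    # Prediction set: all labels whose score stays below the threshold.
    return [set(np.flatnonzero(1.0 - p <= q_hat)) for p in p_test]

def sample(n):
    y = rng.integers(0, N_CLASSES, size=n)
    x = y + rng.normal(0.0, 0.7, size=n)
    return x, y

x_cal, y_cal = sample(500)
x_test, y_test = sample(2000)
pred_sets = split_cp_sets(x_cal, y_cal, x_test, alpha=0.1)
coverage = np.mean([y in s for s, y in zip(pred_sets, y_test)])
```

With alpha = 0.1 the sets cover the true label for roughly 90% of test points, which is the marginal guarantee split CP provides; the abstract's point is that the size of such sets alone is a coarse uncertainty signal compared to the proposed EPU measure.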
@article{Chau2026quantifyingEPU,
  title = {Quantifying Epistemic Predictive Uncertainty in Conformal Prediction},
  author = {Chau, Siu Lun and Zargarbashi, Soroush H. and Sale, Yusuf and Caprio, Michele},
  journal = {arXiv preprint arXiv:2602.016679},
  year = {2026},
}
- [SSRN] Position: Agentic AI Systems should be making Bayes-Consistent Decisions. Theodore Papamarkou, Pierre Alquier, Matthias Bauer, and 26 more authors. SSRN Electronic Journal, 2026.
LLMs excel at predictive tasks and complex reasoning tasks, but many high-value deployments rely on decisions under uncertainty, for example, which tool to call, which expert to consult, or how many resources to invest. While the usefulness and feasibility of Bayesian approaches remain unclear for LLM inference, this position paper argues that the control layer of an agentic AI system (that orchestrates LLMs and tools) is a clear case where Bayesian principles should shine. Bayesian decision theory provides a framework for agentic systems that can help to maintain beliefs over task-relevant latent quantities, to update these beliefs from observed agentic and human-AI interactions, and to choose actions. Making LLMs themselves explicitly Bayesian belief-updating engines remains computationally intensive and conceptually nontrivial as a general modeling target. In contrast, this paper argues that coherent decision-making requires Bayesian principles at the level of the agentic system, not necessarily the LLM agent parameters. This paper articulates practical properties for Bayesian control that fit modern agentic AI systems and human-AI collaboration, and provides concrete examples and design patterns to illustrate how calibrated beliefs and utility-aware policies can improve agentic AI orchestration.
@article{positions2026,
  title = {Position: Agentic AI Systems should be making Bayes-Consistent Decisions},
  author = {Papamarkou, Theodore and Alquier, Pierre and Bauer, Matthias and Buntine, Wray and Davison, Andrew and Dziugaite, Gintare Karolina and Filippone, Maurizio and Foong, Andrew Y. K. and Fortuin, Vincent and Fouskakis, Dimitris and H{\"u}llermeier, Eyke and Karaletsos, Theofanis and Khan, Mohammad Emtiyaz and Kotelevskii, Nikita and Lahlou, Salem and Li, Yingzhen and Liu, Fang and Lyle, Clare and M{\"o}llenhoff, Thomas and Palla, Konstantina and Panov, Maxim and Sale, Yusuf and Schweighofer, Kajetan and Shelmanov, Artem and Swaroop, Siddharth and Trapp, Martin and Waegeman, Willem and Wilson, Andrew Gordon and Zaytsev, Alexey},
  journal = {SSRN Electronic Journal},
  year = {2026},
}
- [ICLR] Efficient Credal Prediction through Decalibration. Paul Hofman, Timo Löhr, Maximilian Muschalik, and 2 more authors. In The Fourteenth International Conference on Learning Representations, 2026.
A reliable representation of uncertainty is essential for the application of modern machine learning methods in safety-critical settings. In this regard, the use of credal sets (i.e., convex sets of probability distributions) has recently been proposed as a suitable approach to representing epistemic uncertainty. However, as with other approaches to epistemic uncertainty, training credal predictors is computationally complex and usually involves (re-)training an ensemble of models. The resulting computational complexity prevents their adoption for complex models such as foundation models and multi-modal systems. To address this problem, we propose an efficient method for credal prediction that is grounded in the notion of relative likelihood and inspired by techniques for the calibration of probabilistic classifiers. For each class label, our method predicts a range of plausible probabilities in the form of an interval. To produce the lower and upper bounds of these intervals, we propose a technique that we refer to as decalibration. Extensive experiments show that our method yields credal sets with strong performance across diverse tasks, including coverage–efficiency evaluation, out-of-distribution detection, and in-context learning. Notably, we demonstrate credal prediction on models such as TabPFN and CLIP—architectures for which the construction of credal sets was previously infeasible.
@inproceedings{hofman2026efficient,
  title = {Efficient Credal Prediction through Decalibration},
  author = {Hofman, Paul and L{\"o}hr, Timo and Muschalik, Maximilian and Sale, Yusuf and H{\"u}llermeier, Eyke},
  booktitle = {The Fourteenth International Conference on Learning Representations},
  year = {2026},
}
- [AISTATS] Conformal Prediction in Hierarchical Classification with Constrained Representation Complexity. Thomas Mortier, Alireza Javanmardi, Yusuf Sale, and 2 more authors. In The 29th International Conference on Artificial Intelligence and Statistics, 2026.
Conformal prediction has emerged as a widely used framework for constructing valid prediction sets in classification and regression tasks. In this work, we extend the split conformal prediction framework to hierarchical classification, where prediction sets are commonly restricted to internal nodes of a predefined hierarchy, and propose two computationally efficient inference algorithms. The first algorithm returns internal nodes as prediction sets, while the second relaxes this restriction, using the notion of representation complexity, yielding a more general and combinatorial inference problem, but smaller set sizes. Empirical evaluations on several benchmark datasets demonstrate the effectiveness of the proposed algorithms in achieving nominal coverage.
@inproceedings{mortier2025hierachical,
  title = {Conformal Prediction in Hierarchical Classification with Constrained Representation Complexity},
  author = {Mortier, Thomas and Javanmardi, Alireza and Sale, Yusuf and H{\"u}llermeier, Eyke and Waegeman, Willem},
  booktitle = {The 29th International Conference on Artificial Intelligence and Statistics},
  year = {2026},
}
- [AAAI] Uncertainty Quantification for Machine Learning: One Size Does Not Fit All. Paul Hofman, Yusuf Sale, and Eyke Hüllermeier. Proceedings of the AAAI Conference on Artificial Intelligence, 40, 2026.
Proper quantification of predictive uncertainty is essential for the use of machine learning in safety-critical applications. Various uncertainty measures have been proposed for this purpose, typically claiming superiority over other measures. In this paper, we argue that there is no single best measure. Instead, uncertainty quantification should be tailored to the specific application. To this end, we use a flexible family of uncertainty measures that distinguishes between total, aleatoric, and epistemic uncertainty of second-order distributions. These measures can be instantiated with specific loss functions, so-called proper scoring rules, to control their characteristics, and we show that different characteristics are useful for different tasks. In particular, we show that, for the task of selective prediction, the scoring rule should ideally match the task loss. On the other hand, for out-of-distribution detection, our results confirm that mutual information, a widely used measure of epistemic uncertainty, performs best. Furthermore, in an active learning setting, epistemic uncertainty based on zero-one loss is shown to consistently outperform other uncertainty measures.
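The decomposition underlying measures such as mutual information can be sketched for the familiar entropy-based (log-loss) instantiation, applied to an ensemble's second-order prediction; the function names are illustrative assumptions, not the paper's code:

```python
import numpy as np

def entropy(p, axis=-1):
    # Shannon entropy with clipping for numerical safety.
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=axis)

def entropy_decomposition(member_probs):
    """member_probs: (n_members, n_classes) class distributions
    predicted for one instance by an ensemble."""
    mean_p = member_probs.mean(axis=0)
    total = entropy(mean_p)                   # entropy of averaged prediction
    aleatoric = entropy(member_probs).mean()  # average member entropy
    epistemic = total - aleatoric             # mutual information I(Y; model)
    return total, aleatoric, epistemic

agree = np.tile([0.6, 0.3, 0.1], (5, 1))     # all members agree
disagree = np.array([[0.8, 0.1, 0.1],
                     [0.1, 0.8, 0.1],
                     [0.1, 0.1, 0.8]])        # members conflict
tu_a, au_a, eu_a = entropy_decomposition(agree)
tu_d, au_d, eu_d = entropy_decomposition(disagree)
```

The epistemic part is exactly the mutual information between label and model, and it vanishes when the members coincide; the paper's point is that log loss is only one instantiation of the scoring-rule family, and other losses suit other tasks.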
@article{hofman2025size,
  title = {Uncertainty Quantification for Machine Learning: One Size Does Not Fit All},
  author = {Hofman, Paul and Sale, Yusuf and H{\"u}llermeier, Eyke},
  journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {40},
  year = {2026},
}
2025
- [arXiv] Uncertainty Quantification for Regression: A Unified Framework based on kernel scores. Christopher Bülte, Yusuf Sale, Gitta Kutyniok, and 1 more author. arXiv preprint arXiv:2510.25599, 2025.
Regression tasks, notably in safety-critical domains, require proper uncertainty quantification, yet the literature remains largely classification-focused. In this light, we introduce a family of measures for total, aleatoric, and epistemic uncertainty based on proper scoring rules, with a particular emphasis on kernel scores. The framework unifies several well-known measures and provides a principled recipe for designing new ones whose behavior, such as tail sensitivity, robustness, and out-of-distribution responsiveness, is governed by the choice of kernel. We prove explicit correspondences between kernel-score characteristics and downstream behavior, yielding concrete design guidelines for task-specific measures. Extensive experiments demonstrate that these measures are effective in downstream tasks and reveal clear trade-offs among instantiations, including robustness and out-of-distribution detection performance.
@article{buelte2025kernel,
  title = {Uncertainty Quantification for Regression: A Unified Framework based on kernel scores},
  author = {B{\"u}lte, Christopher and Sale, Yusuf and Kutyniok, Gitta and H{\"u}llermeier, Eyke},
  journal = {arXiv preprint arXiv:2510.25599},
  year = {2025},
}
- [EGU] Valid Prediction Intervals for Weather Forecasting with Conformal Prediction. Thomas Mortier, Cas Decancq, Yusuf Sale, and 4 more authors. In EGU General Assembly 2025, 2025.
In recent years, machine learning has emerged as a promising alternative to numerical weather prediction models, offering the potential for cost-effective and accurate forecasts. However, a significant limitation of current machine learning methods for weather forecasting is the lack of principled and efficient uncertainty quantification—a key element given the complexity of the Earth’s climate system and the challenges in modeling its processes and feedback mechanisms. Inadequate uncertainty quantification and reporting undermine trust in, and limit the practical use of, current weather forecasting methods (Eyring et al., 2024). Uncertainty quantification methods for weather forecasting typically use prediction intervals and can be categorized into Bayesian and frequentist approaches. Bayesian methods, while theoretically appealing, often involve restrictive assumptions and do not scale well to the complexity of spatio-temporal data. Frequentist approaches, such as ensemble-based methods, are widely used in weather forecasting and include techniques like perturbing initial states with noise (Bi et al., 2023; Scher et al., 2021), varying neural network parameters (Graubner et al., 2022), or training generative models (Price et al., 2023). However, most frequentist methods provide only asymptotically valid prediction intervals, which may not suffice in all weather forecasting applications. Conformal prediction (CP) is a promising uncertainty quantification framework that delivers valid and efficient prediction intervals for any learning algorithm, without requiring assumptions about the underlying data distribution (Vovk et al., 2005). Despite its growing popularity in the machine learning and statistics communities, traditional CP methods are not tailored to spatio-temporal data in weather forecasting.
This is due to challenges arising from spatial and temporal dependencies—such as spatial autocorrelation and temporal dynamics—that violate the exchangeability assumption underlying standard CP methods. Several recent studies attempted to address these challenges by introducing new CP algorithms specifically designed for various types of non-exchangeability (Oliveira et al., 2024). However, these adaptations face several limitations, including high computational complexity, asymptotic guarantees, and/or the need for recalibration of prediction intervals. In this presentation, we will evaluate CP methods in the context of weather forecasting and discuss several limitations. In addition, we will highlight recent advances and discuss potential future directions that could address challenges underlying the use of CP in weather forecasting.
@inproceedings{mortier2025forecasting,
  title = {Valid Prediction Intervals for Weather Forecasting with Conformal Prediction},
  author = {Mortier, Thomas and Decancq, Cas and Sale, Yusuf and Javanmardi, Alireza and Waegeman, Willem and H{\"u}llermeier, Eyke and Miralles, Diego},
  booktitle = {EGU General Assembly 2025},
  year = {2025},
}
- [COPA] Aleatoric and Epistemic Uncertainty in Conformal Prediction. Yusuf Sale, Alireza Javanmardi, and Eyke Hüllermeier. In Symposium on Conformal and Probabilistic Prediction with Applications, 2025.
Recently, there has been a particular interest in distinguishing different types of uncertainty in supervised machine learning (ML) settings (Hüllermeier and Waegeman, 2021). Aleatoric uncertainty captures the inherent randomness in the data-generating process. As it represents variability that cannot be reduced even with more data, it is often referred to as irreducible uncertainty. In contrast, epistemic uncertainty arises from a lack of knowledge about the underlying data-generating process, which–in principle–can be reduced by acquiring additional data or improving the model itself (viz. reducible uncertainty). In parallel, interest in conformal prediction (CP)–both its theory and applications–has become equally vigorous. Conformal Prediction (Vovk et al., 2005) is a model-agnostic framework for uncertainty quantification that provides prediction sets or intervals with rigorous statistical coverage guarantees. Notably, CP is distribution-free and makes only the mild assumption of exchangeability. Under this assumption, it yields prediction intervals that contain the true label with a user-specified probability. Thus, CP is seen as a promising tool to quantify uncertainty. But how is it related to aleatoric and epistemic uncertainty? In particular, we first analyze how (estimates of) aleatoric and epistemic uncertainty enter into the construction of vanilla CP–that is, how noise and model error jointly shape the global threshold. We then review “uncertainty-aware” extensions that integrate these uncertainty estimates into the CP pipeline.
@inproceedings{javanmardi2025conformal,
  title = {Aleatoric and Epistemic Uncertainty in Conformal Prediction},
  author = {Sale, Yusuf and Javanmardi, Alireza and H{\"u}llermeier, Eyke},
  booktitle = {Symposium on Conformal and Probabilistic Prediction with Applications},
  pages = {784--786},
  year = {2025},
  organization = {PMLR},
}
- [UAI] Conformal Prediction without Nonconformity Scores. Jonas Hanselle, Alireza Javanmardi, Tobias Florin Oberkofler, and 2 more authors. In The 41st Conference on Uncertainty in Artificial Intelligence, 2025.
Conformal prediction (CP) is an uncertainty quantification framework that allows for constructing statistically valid prediction sets. Key to the construction of these sets is the notion of a nonconformity function, which assigns a real-valued score to individual data points: only those (hypothetical) data points that sufficiently conform to the data contribute to a prediction set. The point of departure of this work is the observation that CP predictions are invariant against (strictly) monotone transformations of the nonconformity function. In other words, it is only the ordering of the scores that matters, not their quantitative values. Consequently, instead of scoring individual data points, a conformal predictor only needs to be able to compare pairs of data points, deciding which of them is the more conforming one. This suggests an interesting connection between CP and preference learning, in particular learning-to-rank methods, and makes CP amenable to training data in the form of (qualitative) preferences. Elaborating on this connection, we propose methods for preference-based CP and show their usefulness in real-world classification tasks.
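The invariance observation is easy to verify numerically: applying a strictly increasing map such as exp to all nonconformity scores leaves split-CP prediction sets unchanged. The setup and names below are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(1)

def cp_sets(cal_scores, test_scores, alpha=0.1):
    """cal_scores: (n_cal,) nonconformity scores of calibration data;
    test_scores: (n_test, n_labels) scores for each hypothetical label."""
    n = len(cal_scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    # method="higher" returns an actual order statistic of the scores,
    # so the threshold transforms along with the scores.
    q_hat = np.quantile(cal_scores, level, method="higher")
    return [set(np.flatnonzero(row <= q_hat)) for row in test_scores]

cal = rng.normal(size=200)
test = rng.normal(size=(50, 4))

sets_raw = cp_sets(cal, test)
sets_mono = cp_sets(np.exp(cal), np.exp(test))  # strictly increasing map
```

Because only the ordering of scores enters the construction, `sets_raw` and `sets_mono` are identical, which is precisely what licenses replacing score values with pairwise comparisons.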
@inproceedings{hanselle2025nonconformity,
  title = {{Conformal Prediction without Nonconformity Scores}},
  author = {Hanselle, Jonas and Javanmardi, Alireza and Oberkofler, Tobias Florin and Sale, Yusuf and H{\"u}llermeier, Eyke},
  booktitle = {The 41st Conference on Uncertainty in Artificial Intelligence},
  pages = {2282--2292},
  year = {2025},
  organization = {PMLR},
}
- [arXiv] Uncertainty Quantification with Proper Scoring Rules: Adjusting Measures to Prediction Tasks. Paul Hofman, Yusuf Sale, and Eyke Hüllermeier. arXiv preprint arXiv:2505.22538, 2025.
We address the problem of uncertainty quantification and propose measures of total, aleatoric, and epistemic uncertainty based on a known decomposition of (strictly) proper scoring rules, a specific type of loss function, into a divergence and an entropy component. This leads to a flexible framework for uncertainty quantification that can be instantiated with different losses (scoring rules), which makes it possible to tailor uncertainty quantification to the use case at hand. We show that this flexibility is indeed advantageous. In particular, we analyze the task of selective prediction and show that the scoring rule should ideally match the task loss. In addition, we perform experiments on two other common tasks. For out-of-distribution detection, our results confirm that a widely used measure of epistemic uncertainty, mutual information, performs best. Moreover, in the setting of active learning, our measure of epistemic uncertainty based on the zero-one-loss consistently outperforms other uncertainty measures.
@article{hofmann2025properscoring,
  title = {Uncertainty Quantification with Proper Scoring Rules: Adjusting Measures to Prediction Tasks},
  author = {Hofman, Paul and Sale, Yusuf and H{\"u}llermeier, Eyke},
  journal = {arXiv preprint arXiv:2505.22538},
  year = {2025},
}
- [arXiv] An Axiomatic Assessment of Entropy- and Variance-based Uncertainty Quantification in Regression. Christopher Bülte, Yusuf Sale, Timo Löhr, and 3 more authors. arXiv preprint arXiv:2504.18433, 2025.
Uncertainty quantification (UQ) is crucial in machine learning, yet most (axiomatic) studies of uncertainty measures focus on classification, leaving a gap in regression settings with limited formal justification and evaluations. In this work, we introduce a set of axioms to rigorously assess measures of aleatoric, epistemic, and total uncertainty in supervised regression. By utilizing a predictive exponential family, we can generalize commonly used approaches for uncertainty representation and corresponding uncertainty measures. More specifically, we analyze the widely used entropy- and variance-based measures regarding limitations and challenges. Our findings provide a principled foundation for uncertainty quantification in regression, offering theoretical insights and practical guidelines for reliable uncertainty assessment.
@article{buelte2025axiomatic,
  title = {An Axiomatic Assessment of Entropy- and Variance-based Uncertainty Quantification in Regression},
  author = {B{\"u}lte, Christopher and Sale, Yusuf and L{\"o}hr, Timo and Hofman, Paul and Kutyniok, Gitta and H{\"u}llermeier, Eyke},
  journal = {arXiv preprint arXiv:2504.18433},
  year = {2025},
}
- [TMLR] Online Selective Conformal Inference: Errors and Solutions. Yusuf Sale and Aaditya Ramdas. Transactions on Machine Learning Research, 2025.
In online selective conformal inference, data arrives sequentially, and prediction intervals are constructed only when an online selection rule is met. Since online selections may break the exchangeability between the selected test datum and the rest of the data, one must correct for this by suitably selecting the calibration data. In this paper, we evaluate existing calibration selection strategies and pinpoint some fundamental errors in the associated claims that guarantee selection-conditional coverage and control of the false coverage rate (FCR). To address these shortcomings, we propose novel calibration selection strategies that provably preserve the exchangeability of the calibration data and the selected test datum. Consequently, we demonstrate that online selective conformal inference with these strategies guarantees both selection-conditional coverage and FCR control. Our theoretical findings are supported by experimental evidence examining trade-offs between valid methods.
@article{sale2025online,
  title = {Online Selective Conformal Inference: Errors and Solutions},
  author = {Sale, Yusuf and Ramdas, Aaditya},
  journal = {Transactions on Machine Learning Research},
  year = {2025},
}
- [ISIPTA] Conformal Prediction Regions are Imprecise Highest Density Regions. Michele Caprio, Yusuf Sale, and Eyke Hüllermeier. In Proceedings of the Fourteenth International Symposium on Imprecise Probabilities: Theories and Applications, 2025.
Recently, Cella and Martin proved how, under an assumption called consonance, a credal set (i.e. a closed and convex set of probabilities) can be derived from the conformal transducer associated with transductive conformal prediction. We show that the Imprecise Highest Density Region (IHDR) associated with such a credal set corresponds to the classical Conformal Prediction Region. In proving this result, we establish a new relationship between Conformal Prediction and Imprecise Probability (IP) theories, via the IP concept of a cloud. A byproduct of our presentation is the discovery that consonant plausibility functions are monoid homomorphisms, a new algebraic property of an IP tool.
@inproceedings{caprio2025conformal,
  title = {Conformal Prediction Regions are Imprecise Highest Density Regions},
  author = {Caprio, Michele and Sale, Yusuf and H{\"u}llermeier, Eyke},
  booktitle = {Proceedings of the Fourteenth International Symposium on Imprecise Probabilities: Theories and Applications},
  pages = {47--59},
  year = {2025},
  organization = {PMLR},
}
2024
- [UAI] Label-wise Aleatoric and Epistemic Uncertainty Quantification. Yusuf Sale, Paul Hofman, Timo Löhr, and 3 more authors. In The 40th Conference on Uncertainty in Artificial Intelligence, 2024.
We present a novel approach to uncertainty quantification in classification tasks based on label-wise decomposition of uncertainty measures. This label-wise perspective allows uncertainty to be quantified at the individual class level, thereby improving cost-sensitive decision-making and helping understand the sources of uncertainty. Furthermore, it allows total, aleatoric, and epistemic uncertainty to be defined on the basis of non-categorical measures such as variance, going beyond common entropy-based measures. In particular, variance-based measures address some of the limitations associated with established methods that have recently been discussed in the literature. We show that our proposed measures adhere to a number of desirable properties. Through empirical evaluation on a variety of benchmark data sets – including applications in the medical domain where accurate uncertainty quantification is crucial – we establish the effectiveness of label-wise uncertainty quantification.
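A label-wise, variance-based decomposition can be sketched via the law of total variance applied per class: Var(Y_k) = E[p_k(1 - p_k)] + Var(p_k), so the total per-class variance splits exactly into an aleatoric and an epistemic part. This is an illustrative reconstruction under that identity, not the authors' code:

```python
import numpy as np

def labelwise_variance_uq(member_probs):
    """member_probs: (n_members, n_classes) sampled first-order
    distributions representing a second-order prediction."""
    p = member_probs
    mean_p = p.mean(axis=0)
    aleatoric = (p * (1.0 - p)).mean(axis=0)  # expected conditional variance
    epistemic = p.var(axis=0)                 # variance of p_k across members
    total = mean_p * (1.0 - mean_p)           # = aleatoric + epistemic
    return total, aleatoric, epistemic

probs = np.array([[0.7, 0.2, 0.1],
                  [0.4, 0.5, 0.1],
                  [0.5, 0.3, 0.2]])
total, aleatoric, epistemic = labelwise_variance_uq(probs)
```

Each class gets its own triple of uncertainty values, which is what enables the cost-sensitive, per-class reasoning described in the abstract.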
@inproceedings{salelabel,
  title = {{Label-wise Aleatoric and Epistemic Uncertainty Quantification}},
  author = {Sale, Yusuf and Hofman, Paul and L{\"o}hr, Timo and Wimmer, Lisa and Nagler, Thomas and H{\"u}llermeier, Eyke},
  booktitle = {The 40th Conference on Uncertainty in Artificial Intelligence},
  year = {2024},
}
- [arXiv] Quantifying Aleatoric and Epistemic Uncertainty with Proper Scoring Rules. Paul Hofman, Yusuf Sale, and Eyke Hüllermeier. arXiv preprint arXiv:2404.12215, 2024.
Uncertainty representation and quantification are paramount in machine learning and constitute an important prerequisite for safety-critical applications. In this paper, we propose novel measures for the quantification of aleatoric and epistemic uncertainty based on proper scoring rules, which are loss functions with the meaningful property that they incentivize the learner to predict ground-truth (conditional) probabilities. We assume two common representations of (epistemic) uncertainty, namely, in terms of a credal set, i.e. a set of probability distributions, or a second-order distribution, i.e., a distribution over probability distributions. Our framework establishes a natural bridge between these representations. We provide a formal justification of our approach and introduce new measures of epistemic and aleatoric uncertainty as concrete instantiations.
@article{hofman2024quantifying,
  title = {Quantifying Aleatoric and Epistemic Uncertainty with Proper Scoring Rules},
  author = {Hofman, Paul and Sale, Yusuf and H{\"u}llermeier, Eyke},
  journal = {arXiv preprint arXiv:2404.12215},
  year = {2024},
}
- [arXiv] Explaining Bayesian Optimization by Shapley Values Facilitates Human-AI Collaboration. Julian Rodemann, Federico Croppi, Philipp Arens, and 7 more authors. arXiv preprint arXiv:2403.04629, 2024.
Bayesian optimization (BO) with Gaussian processes (GP) has become an indispensable algorithm for black box optimization problems. Not without a dash of irony, BO is often considered a black box itself, lacking ways to provide reasons as to why certain parameters are proposed to be evaluated. This is particularly relevant in human-in-the-loop applications of BO, such as in robotics. We address this issue by proposing ShapleyBO, a framework for interpreting BO’s proposals by game-theoretic Shapley values. They quantify each parameter’s contribution to BO’s acquisition function. Exploiting the linearity of Shapley values, we are further able to identify how strongly each parameter drives BO’s exploration and exploitation for additive acquisition functions like the confidence bound. We also show that ShapleyBO can disentangle the contributions to exploration into those that explore aleatoric and epistemic uncertainty. Moreover, our method gives rise to a ShapleyBO-assisted human machine interface (HMI), allowing users to interfere with BO in case proposals do not align with human reasoning. We demonstrate this HMI’s benefits for the use case of personalizing wearable robotic devices (assistive back exosuits) by human-in-the-loop BO. Results suggest human-BO teams with access to ShapleyBO can achieve lower regret than teams without.
@article{rodemann2024explaining,
  title = {Explaining Bayesian Optimization by Shapley Values Facilitates Human-AI Collaboration},
  author = {Rodemann, Julian and Croppi, Federico and Arens, Philipp and Sale, Yusuf and Herbinger, Julia and Bischl, Bernd and H{\"u}llermeier, Eyke and Augustin, Thomas and Walsh, Conor J and Casalicchio, Giuseppe},
  journal = {arXiv preprint arXiv:2403.04629},
  year = {2024},
}
- [SPIGM] Quantifying Aleatoric and Epistemic Uncertainty: A Credal Approach. Paul Hofman, Yusuf Sale, and Eyke Hüllermeier. In ICML 2024 Workshop on Structured Probabilistic Inference & Generative Modeling, 2024.
Uncertainty representation and quantification are paramount in machine learning, especially in safety-critical applications. In this paper, we propose a novel framework for the quantification of aleatoric and epistemic uncertainty based on the notion of credal sets, i.e., sets of probability distributions. Thus, we assume a learner that produces (second-order) predictions in the form of sets of probability distributions on outcomes. Practically, such an approach can be realized by means of ensemble learning: Given an ensemble of learners, credal sets are generated by including sufficiently plausible predictors, where plausibility is measured in terms of (relative) likelihood. We provide a formal justification for the framework and introduce new measures of epistemic and aleatoric uncertainty as concrete instantiations. We evaluate these measures both theoretically, by analyzing desirable axiomatic properties, and empirically, by comparing them in terms of performance and effectiveness to existing measures of uncertainty in an experimental study.
@inproceedings{hofman2024quantifyinh,
  title = {Quantifying Aleatoric and Epistemic Uncertainty: A Credal Approach},
  author = {Hofman, Paul and Sale, Yusuf and H{\"u}llermeier, Eyke},
  booktitle = {ICML 2024 Workshop on Structured Probabilistic Inference {\&} Generative Modeling},
  year = {2024},
}
- [ICML] Second-Order Uncertainty Quantification: A Distance-Based Approach. Yusuf Sale, Viktor Bengs, Michele Caprio, and 1 more author. In Forty-first International Conference on Machine Learning, 2024.
In the past couple of years, various approaches to representing and quantifying different types of predictive uncertainty in machine learning, notably in the setting of classification, have been proposed on the basis of second-order probability distributions, i.e., predictions in the form of distributions on probability distributions. A completely conclusive solution has not yet been found, however, as shown by recent criticisms of commonly used uncertainty measures associated with second-order distributions, identifying undesirable theoretical properties of these measures. In light of these criticisms, we propose a set of formal criteria that meaningful uncertainty measures for predictive uncertainty based on second-order distributions should obey. Moreover, we provide a general framework for developing uncertainty measures to account for these criteria, and offer an instantiation based on the Wasserstein distance, for which we prove that all criteria are satisfied.
@inproceedings{sale2024secondorder,
  title = {Second-Order Uncertainty Quantification: A Distance-Based Approach},
  author = {Sale, Yusuf and Bengs, Viktor and Caprio, Michele and H{\"u}llermeier, Eyke},
  booktitle = {Forty-first International Conference on Machine Learning},
  year = {2024},
}
2023
- [arXiv] Second-Order Uncertainty Quantification: Variance-Based Measures. Yusuf Sale, Paul Hofman, Lisa Wimmer, and 2 more authors. arXiv preprint arXiv:2401.00276, 2023.
Uncertainty quantification is a critical aspect of machine learning models, providing important insights into the reliability of predictions and aiding the decision-making process in real-world applications. This paper proposes a novel way to use variance-based measures to quantify uncertainty on the basis of second-order distributions in classification problems. A distinctive feature of the measures is the ability to reason about uncertainties on a class-based level, which is useful in situations where nuanced decision-making is required. Recalling some properties from the literature, we highlight that the variance-based measures satisfy important (axiomatic) properties. In addition to this axiomatic approach, we present empirical results showing the measures to be effective and competitive to commonly used entropy-based measures.
@article{sale2023second,
  title = {Second-Order Uncertainty Quantification: Variance-Based Measures},
  author = {Sale, Yusuf and Hofman, Paul and Wimmer, Lisa and H{\"u}llermeier, Eyke and Nagler, Thomas},
  journal = {arXiv preprint arXiv:2401.00276},
  year = {2023},
}
- [COPA] Conformal Prediction with Partially Labeled Data. Alireza Javanmardi, Yusuf Sale, Paul Hofman, and 1 more author. In Conformal and Probabilistic Prediction with Applications, 2023.
While the predictions produced by conformal prediction are set-valued, the data used for training and calibration is supposed to be precise. In the setting of superset learning or learning from partial labels, a variant of weakly supervised learning, it is exactly the other way around: training data is possibly imprecise (set-valued), but the model induced from this data yields precise predictions. In this paper, we combine the two settings by making conformal prediction amenable to set-valued training data. We propose a generalization of the conformal prediction procedure that can be applied to set-valued training and calibration data. We prove the validity of the proposed method and present experimental studies in which it compares favorably to natural baselines.
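For context, the standard split conformal procedure with precise labels, which this paper generalizes to set-valued training and calibration data, can be sketched as follows. This is an illustrative implementation under assumed conventions (nonconformity score = 1 minus the model's probability for a label), not the paper's code.

```python
import math

def split_conformal_set(cal_scores, test_scores, alpha):
    """Split conformal prediction with precise labels: calibrate a quantile
    threshold on held-out nonconformity scores, then keep every candidate
    label whose score does not exceed it."""
    n = len(cal_scores)
    # Index of the ceil((n + 1) * (1 - alpha))-th smallest calibration score.
    rank = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    threshold = sorted(cal_scores)[rank]
    return {y for y, s in test_scores.items() if s <= threshold}

cal = [i / 200 for i in range(200)]        # e.g. 1 - p_hat(true class) per point
test = {"a": 0.05, "b": 0.5, "c": 0.99}    # score for each candidate label
print(split_conformal_set(cal, test, alpha=0.1))  # contains "a" and "b", not "c"
```

The quantile rank is the standard finite-sample correction that yields marginal coverage of at least 1 − α for exchangeable data; the paper's contribution is to keep this validity when calibration labels are themselves sets.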
@inproceedings{javanmardi2023conformal, title = {Conformal Prediction with Partially Labeled Data}, author = {Javanmardi, Alireza and Sale, Yusuf and Hofman, Paul and H{\"u}llermeier, Eyke}, booktitle = {Conformal and Probabilistic Prediction with Applications}, pages = {251--266}, year = {2023}, organization = {PMLR}, }
- [Epi UAI] A Novel Bayes’ Theorem for Upper Probabilities. Michele Caprio, Yusuf Sale, Eyke Hüllermeier, and 1 more author. In International Workshop on Epistemic Uncertainty in Artificial Intelligence, 2023
In their seminal 1990 paper, Wasserman and Kadane establish an upper bound for the Bayes’ posterior probability of a measurable set, when the prior lies in a class of probability measures and the likelihood is precise. They also give a sufficient condition for such upper bound to hold with equality. In this paper, we introduce a generalization of their result by additionally addressing uncertainty related to the likelihood. We give an upper bound for the posterior probability when both the prior and the likelihood belong to a set of probabilities. Furthermore, we give a sufficient condition for this upper bound to become an equality. This result is interesting in its own right and has the potential to be applied to various fields of engineering (e.g., model predictive control), machine learning, and artificial intelligence.
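The object being bounded can be illustrated numerically. When the prior set and the likelihood set each have finitely many extreme points, the upper posterior probability of an event is attained at one of the vertex pairs (a linear-fractional objective is maximized at a vertex of a polytope), so a brute-force enumeration suffices. This sketch is an illustration under that finite-vertex assumption, not the paper's construction.

```python
from itertools import product

def upper_posterior(priors, likelihoods, event):
    """Upper posterior probability of `event`: the supremum of the Bayes
    posterior over every (prior, likelihood) vertex pair.
    Each prior is a dict theta -> p(theta); each likelihood is a dict
    theta -> L(x | theta) for the observed data x; `event` is a set of thetas."""
    best = 0.0
    for p, lik in product(priors, likelihoods):
        evidence = sum(lik[t] * p[t] for t in p)        # normalizing constant
        post = sum(lik[t] * p[t] for t in event) / evidence
        best = max(best, post)
    return best

# Two extreme priors and two extreme likelihoods over hypotheses h1, h2.
priors = [{"h1": 0.5, "h2": 0.5}, {"h1": 0.7, "h2": 0.3}]
likelihoods = [{"h1": 0.9, "h2": 0.2}, {"h1": 0.8, "h2": 0.4}]
print(upper_posterior(priors, likelihoods, {"h1"}))
```

Fixing the likelihood set to a single element recovers the Wasserman-Kadane setting with imprecision only in the prior.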
@inproceedings{caprio2023novel, title = {A Novel Bayes’ Theorem for Upper Probabilities}, author = {Caprio, Michele and Sale, Yusuf and H{\"u}llermeier, Eyke and Lee, Insup}, booktitle = {International Workshop on Epistemic Uncertainty in Artificial Intelligence}, pages = {1--12}, year = {2023}, organization = {Springer}, }
- [UAI] Quantifying Aleatoric and Epistemic Uncertainty in Machine Learning: Are Conditional Entropy and Mutual Information Appropriate Measures? Lisa Wimmer, Yusuf Sale, Paul Hofman, and 2 more authors. In The 39th Conference on Uncertainty in Artificial Intelligence, 2023
The quantification of aleatoric and epistemic uncertainty in terms of conditional entropy and mutual information, respectively, has recently become quite common in machine learning. While the properties of these measures, which are rooted in information theory, seem appealing at first glance, we identify various incoherencies that call their appropriateness into question. In addition to the measures themselves, we critically discuss the idea of an additive decomposition of total uncertainty into its aleatoric and epistemic constituents. Experiments across different computer vision tasks support our theoretical findings and raise concerns about current practice in uncertainty quantification.
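The additive decomposition the paper critiques reads: total uncertainty is the entropy of the mean prediction, aleatoric uncertainty is the expected conditional entropy, and epistemic uncertainty is their difference, the mutual information. A minimal sketch of that decomposition (illustrative code, not from the paper; the two-sample "ensemble" is an assumed toy second-order distribution):

```python
import math

def entropy(p):
    """Shannon entropy in bits of a categorical distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def entropy_decomposition(second_order_samples):
    """total = H(mean prediction); aleatoric = E[H(p)]; epistemic = difference
    (the mutual information between outcome and model, always >= 0)."""
    n = len(second_order_samples)
    num_classes = len(second_order_samples[0])
    mean = [sum(s[k] for s in second_order_samples) / n for k in range(num_classes)]
    total = entropy(mean)
    aleatoric = sum(entropy(s) for s in second_order_samples) / n
    return total, aleatoric, total - aleatoric

samples = [[0.9, 0.1], [0.1, 0.9]]   # two conflicting plausible predictors
total, au, eu = entropy_decomposition(samples)
print(round(total, 3), round(au, 3), round(eu, 3))
```

In this toy case the mean prediction is uniform, so total uncertainty is maximal even though each individual predictor is confident; disagreement between predictors shows up entirely in the epistemic (mutual-information) term. The paper's point is that, despite such intuitive behavior, these measures exhibit incoherencies on closer inspection.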
@inproceedings{wimmer2023quantifying, title = {{Quantifying Aleatoric and Epistemic Uncertainty in Machine Learning: Are Conditional Entropy and Mutual Information Appropriate Measures?}}, author = {Wimmer, Lisa and Sale, Yusuf and Hofman, Paul and Bischl, Bernd and H{\"u}llermeier, Eyke}, booktitle = {The 39th Conference on Uncertainty in Artificial Intelligence}, pages = {2282--2292}, year = {2023}, organization = {PMLR}, }
- [UAI] Is the Volume of a Credal Set a Good Measure for Epistemic Uncertainty? Yusuf Sale, Michele Caprio, and Eyke Hüllermeier. In The 39th Conference on Uncertainty in Artificial Intelligence, 2023
Adequate uncertainty representation and quantification have become imperative in various scientific disciplines, especially in machine learning and artificial intelligence. As an alternative to representing uncertainty via one single probability measure, we consider credal sets (convex sets of probability measures). The geometric representation of credal sets as d-dimensional polytopes implies a geometric intuition about (epistemic) uncertainty. In this paper, we show that the volume of the geometric representation of a credal set is a meaningful measure of epistemic uncertainty in the case of binary classification, but less so for multi-class classification. Our theoretical findings highlight the crucial role of specifying and employing uncertainty measures in machine learning in an appropriate way, and for being aware of possible pitfalls.
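In the binary case the geometry is simple: a credal set over two classes is fully described by the interval of probabilities assigned to class 1, so its volume is just the interval width. A minimal sketch (illustrative, not the paper's code):

```python
def credal_interval_width(probabilities):
    """Binary classification: the credal set induced by a finite collection of
    plausible predictors is the interval [min, max] of their class-1
    probabilities, so its 'volume' is the interval width."""
    p1 = [p[1] for p in probabilities]
    return max(p1) - min(p1)

# Three plausible predictors for one instance, each a distribution over 2 classes.
credal = [[0.8, 0.2], [0.6, 0.4], [0.7, 0.3]]
print(credal_interval_width(credal))
```

For K > 2 classes the credal set is a polytope inside a (K − 1)-dimensional simplex, and, as the paper argues, its volume loses the desirable properties it has in this one-dimensional case.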
@inproceedings{sale2023volume, title = {Is the Volume of a Credal Set a Good Measure for Epistemic Uncertainty?}, author = {Sale, Yusuf and Caprio, Michele and H{\"u}llermeier, Eyke}, booktitle = {The 39th Conference on Uncertainty in Artificial Intelligence}, pages = {1795--1804}, year = {2023}, organization = {PMLR}, }
2022
- [Thesis] G-Framework in Statistics. Yusuf Sale, 2022
In order to achieve reliable results via statistical methodology, one important goal is to account for potential uncertainty. Shige Peng introduced an uncertainty counterpart of Kolmogorov’s probabilistic setting: the G-Framework. While this framework is well known in mathematical finance, work within the G-Framework in statistics is limited. This thesis motivates nonlinear expectations for decision-making under uncertainty in dynamic and non-dynamic situations. Switching the viewpoint from probability spaces to expectation spaces, we discuss the theoretical foundations of the G-Framework, emphasizing comprehensibility. We motivate nonlinear expectations for subsequent application in statistics via notions that emerged in various academic communities and are likewise concerned with decision-making under uncertainty: Choquet expectations, which express probabilistic uncertainty from the viewpoint of non-additive measures, and g-expectations, a nonlinear class of expectations based on backward stochastic differential equations (BSDEs). For explicit understanding, we provide the required foundations of stochastic calculus in a self-contained form. The applicability of the G-Framework in statistics is particularly evident from the respective Law of Large Numbers and Central Limit Theorem. To emphasize this applicability, the thesis also motivates a notion of sublinear regression.
@article{sale2022g, title = {G-Framework in Statistics}, author = {Sale, Yusuf}, year = {2022}, }
2019
- [Thesis] Robuste Minimax-Tests und eine Robustifizierung des Bayes-Faktors. Yusuf Sale, 2019
Robust statistics is concerned with statistical procedures that account for possible deviations from underlying model assumptions. In particular, the likelihood ratio test is, in general, not robust against deviations from the underlying distributional assumptions. This thesis therefore motivates and presents a robust version of the likelihood ratio test, derived mathematically via neighborhood models and least favorable pairs. The censored likelihood ratio test turns out to possess desirable minimax properties, which is why one also speaks of robust minimax tests in this context. These minimax results are generalized, by means of the Huber-Strassen theorem, to 2-alternating capacities on Polish spaces. The Huber-Strassen theorem, together with its far-reaching significance for robust statistics, is discussed in this thesis. In addition to mathematical rigor in the proofs and illustration of the methodological constructs, the practical relevance of such robust procedures is also demonstrated. To carry the methodology of robust procedures over to the Bayesian framework as well, the thesis further addresses the Bayes factor, an alternative to classical hypothesis tests. Based on the results obtained, a robust version of the Bayes factor is proposed, and it is also made clear why such a robustification proves particularly difficult.
@article{sale2019robuste, title = {Robuste Minimax-Tests und eine Robustifizierung des Bayes-Faktors}, author = {Sale, Yusuf}, year = {2019}, }