Towards calibrated multi-label deep neural networks


University of California, San Diego
Published in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Abstract


The problem of calibrating deep neural networks (DNNs) for multi-label learning is considered. It is well known that DNNs trained by cross-entropy for single-label, or one-hot, classification are poorly calibrated. Many calibration techniques have been proposed to address this problem. However, little attention has been paid to the calibration of multi-label DNNs. In this literature, the focus has been on improving labeling accuracy in the face of severe dataset imbalance. This is addressed by the introduction of asymmetric losses, which have become very popular. However, these losses do not induce well-calibrated classifiers. In this work, we first provide a theoretical explanation for this poor calibration performance, by showing that these losses lack the strictly proper property, a necessary condition for accurate probability estimation. To overcome this problem, we propose a new Strictly Proper Asymmetric (SPA) loss. This is complemented by a Label Pair Regularizer (LPR) that increases the number of calibration constraints introduced per training example. The effectiveness of both contributions is validated by extensive experiments on various multi-label datasets. The resulting training method is shown to significantly decrease the calibration error while maintaining state-of-the-art accuracy.
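For reference, the strictly proper property mentioned above has a standard formulation in the class-probability estimation (CPE) literature: a loss is strictly proper when its conditional risk is uniquely minimized at the true posterior. In the notation used here, with η the true class-posterior probability and q the predicted probability:

```latex
% A CPE loss \ell is strictly proper if, for every \eta \in (0,1),
% the conditional risk
%   R(q) = \eta\,\ell(1, q) + (1-\eta)\,\ell(0, q)
% has a unique minimizer at the true posterior:
\[
\eta \;=\; \arg\min_{q \in [0,1]} \;
\eta\,\ell(1, q) + (1-\eta)\,\ell(0, q).
\]
```

Without this property, even a classifier that minimizes the training risk exactly can output scores that do not match the true posterior probabilities, which is the source of the miscalibration analyzed in the paper.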

Methodology


Table: Multi-label CPE losses discussed in this work. The last column indicates whether there is a bijective map between the true class-posterior probability η and the CPE risk minimizer.
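As a concrete illustration of the bijectivity criterion (a standard fact, not specific to this paper), per-label binary cross-entropy is strictly proper: its conditional risk is uniquely minimized at the posterior, so the map from η to the risk minimizer is the identity and hence bijective.

```latex
\[
\ell(y, q) = -y \log q - (1-y)\log(1-q), \qquad
R(q) = -\eta \log q - (1-\eta)\log(1-q),
\]
\[
R'(q) = -\frac{\eta}{q} + \frac{1-\eta}{1-q} = 0
\;\Longrightarrow\; q^{\ast} = \eta .
\]
```

Asymmetric losses break this property by weighting the positive and negative terms differently, so the risk minimizer is no longer η itself.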

Results



Left: Reliability diagram (calibration curve) of multi-label DNNs trained with the asymmetric focal loss [40], ASY loss [61], and our proposed loss. Right: corresponding retrieval results on the multi-label retrieval task, where the user specifies a query string of desired labels P and undesired labels N. Correct retrieval results are highlighted in green. Improved calibration substantially improves retrieval performance.
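Reliability diagrams of the kind shown above summarize, per confidence bin, the gap between average predicted probability and empirical label frequency; the expected calibration error (ECE) aggregates these gaps. A minimal sketch of this standard metric for per-label multi-label predictions (not the paper's exact evaluation code; bin count and binning scheme are assumptions):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE over flattened per-label predictions.

    probs: predicted probabilities in [0, 1], shape (N,)
    labels: binary ground-truth labels, shape (N,)
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (probs > lo) & (probs <= hi)
        if not mask.any():
            continue
        conf = probs[mask].mean()   # average confidence in this bin
        acc = labels[mask].mean()   # empirical positive frequency in this bin
        ece += mask.mean() * abs(conf - acc)
    return ece

# A well-calibrated predictor: 0.1-confidence items are positive 10% of
# the time, 0.9-confidence items 90% of the time, so ECE is near zero.
probs = np.array([0.1] * 10 + [0.9] * 10)
labels = np.array([0] * 9 + [1] + [1] * 9 + [0])
print(expected_calibration_error(probs, labels))
```

The reliability diagram itself is just the per-bin (conf, acc) pairs plotted against the diagonal; a well-calibrated model hugs the diagonal.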

Poster



Paper


PDF

Supplement

Code

Bibtex

Acknowledgements

This work was partially funded by NSF awards IIS-1924937 and IIS-2041009, a gift from Amazon, a gift from Qualcomm, and NVIDIA GPU donations. We also thank the Nautilus platform, which was used for some of the experiments discussed above.