Towards better confidence estimation for neural models
2019
In this work we focus on confidence modeling for neural-network-based text classification and sequence-to-sequence models in the context of Natural Language Understanding (NLU) tasks. For most applications, the confidence of a neural network model in its output is computed as a function of the posterior probability, determined via a softmax layer. In this work, we show that such scores can be poorly calibrated [1]. We propose new ensemble-based and gradient-based features that predict model uncertainty and confidence. We evaluate the impact of these features through a gradient boosted decision tree (GBDT) framework to produce calibrated confidence scores. We demonstrate that the performance of our proposed approach surpasses the baseline across multiple tasks. Moreover, we show that this method produces confidence scores which are better suited for Out-Of-Distribution (OOD) classification when compared to the baseline.
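As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows a softmax-confidence baseline and a GBDT trained on auxiliary features to output a calibrated confidence score. This is a minimal sketch, not the paper's implementation: the feature names (ensemble disagreement, gradient norm) and the use of scikit-learn's GradientBoostingClassifier are illustrative assumptions standing in for the ensemble and gradient-based features and the GBDT framework mentioned above.

```python
# Minimal sketch (not the paper's implementation) of confidence calibration
# with a GBDT over auxiliary features. Feature names are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def softmax_confidence(logits: np.ndarray) -> np.ndarray:
    """Baseline confidence: maximum softmax probability per example."""
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1)


def build_features(softmax_conf, ensemble_disagreement, grad_norm):
    """Stack hypothetical per-example features column-wise:
    baseline softmax confidence, disagreement across an ensemble of models,
    and the norm of the loss gradient (stand-ins for the paper's features)."""
    return np.column_stack([softmax_conf, ensemble_disagreement, grad_norm])


def fit_confidence_model(features: np.ndarray, model_was_correct: np.ndarray):
    """Train a GBDT to predict whether the base model's prediction is correct;
    its predicted probability then serves as a calibrated confidence score."""
    gbdt = GradientBoostingClassifier(n_estimators=200, max_depth=3)
    gbdt.fit(features, model_was_correct)
    return gbdt


def calibrated_confidence(gbdt, features: np.ndarray) -> np.ndarray:
    """Calibrated confidence: probability that the base prediction is correct."""
    return gbdt.predict_proba(features)[:, 1]
```

In this framing, calibration is cast as a supervised problem: the GBDT learns to map uncertainty-related features to the probability that the underlying model is right, which can then be thresholded or used directly, for example when flagging likely out-of-distribution inputs.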