Establishing the reliability of crowdsourced voice feedback (CVF) and evaluating it in the absence of ground truth
Intelligent Voice Assistant (IVA) systems, such as Alexa, Google Assistant, and Siri, allow users to interact with them through voice commands alone. IVAs can elicit voice feedback directly from users and use these responses to improve various IVA components. One concern with such crowdsourced voice feedback (CVF) data is the reliability of the feedback itself, which may be degraded by background noise or disingenuous responses. In this paper, we propose ways to compute confidence scores that indicate the reliability of CVF data. We build a probabilistic Bayesian Belief Network (BBN) model trained on the CVF data. Since human annotation of CVF data can be expensive, we explore ways to evaluate such a model without human-labeled data. We propose several metrics that (i) require no ground truth, (ii) can be computed directly from the CVF data, and (iii) reliably measure the model's ability to output confidence scores indicating reliability.
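To make the idea of a belief network producing reliability confidence scores concrete, the following is a minimal toy sketch, not the paper's actual model: a naive Bayes-style belief network over a latent "feedback is reliable" variable and two hypothetical observed cues (background-noise level and response latency). All variable names and probability values are made-up illustrative assumptions.

```python
# Toy sketch (NOT the paper's BBN): posterior P(reliable | cues) by
# exact enumeration over a two-cue naive Bayes-style belief network.
# All numbers below are illustrative assumptions, not learned values.

# Prior on the latent variable R = "feedback instance is reliable"
P_R = {True: 0.8, False: 0.2}

# Likelihood of observed background-noise level given R
P_noise_given_R = {
    True:  {"low": 0.7, "high": 0.3},
    False: {"low": 0.2, "high": 0.8},
}

# Likelihood of a fast (plausible-latency) response given R
P_fast_given_R = {True: 0.6, False: 0.3}


def confidence(noise: str, fast: bool) -> float:
    """Return the posterior P(R=True | noise, fast) via Bayes' rule,
    assuming the cues are conditionally independent given R."""
    def joint(r: bool) -> float:
        p_fast = P_fast_given_R[r] if fast else 1.0 - P_fast_given_R[r]
        return P_R[r] * P_noise_given_R[r][noise] * p_fast

    num = joint(True)
    return num / (num + joint(False))


# A clean, prompt response scores higher than a noisy, slow one.
clean = confidence("low", fast=True)
noisy = confidence("high", fast=False)
```
In the paper's setting, the learned BBN would play the role of `confidence` above, mapping observed signals from a CVF instance to a score in [0, 1] that downstream components can threshold or weight by.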