-
ICASSP 2022: Second-pass rescoring is an important component in automatic speech recognition (ASR) systems that is used to improve the outputs from a first-pass decoder by implementing a lattice rescoring or n-best re-ranking. While pretraining with a masked language model (MLM) objective has received great success in various natural language understanding (NLU) tasks, it has not gained traction as a rescoring model
-
The Web Conference 2022: Explainable recommendation seeks to provide not only high-quality recommendations but also intuitive explanations. Our objective is not on generating accurate recommendations per se, but on producing user-friendly explanations through recommendation captions. Importantly, the focus of existing work has been predominantly on explaining a single item recommendation. In e-commerce websites, product recommendations
-
ICASSP 2022: Maximum Likelihood Estimation (MLE) is currently the most common approach to train large scale speech recognition systems. While it has significant practical advantages, MLE exhibits several drawbacks known in literature: training and inference conditions are mismatched and a proxy objective is optimized instead of word error rate. Recently, the Optimal Completion Distillation (OCD) training method was
-
ICASSP 2022: Training speaker-discriminative and robust speaker verification systems without speaker labels is still challenging and worthwhile to explore. In this study, we propose an effective self-supervised learning framework and a novel regularization strategy to facilitate self-supervised speaker representation learning. Different from contrastive learning-based self-supervised learning methods, the proposed self-supervised
-
AAAI 2022: Robots operating in human spaces must be able to engage in natural language interaction, both understanding and executing instructions, and using conversation to resolve ambiguity and correct mistakes. To study this, we introduce TEACh, a dataset of over 3,000 human–human, interactive dialogues to complete household tasks in simulation. A Commander with access to oracle information about a task communicates
Related content
-
June 16, 2021: Relative to human evaluation of question-answering models, the new method has an error rate of only 7%.
-
June 15, 2021: Alexa Fund company unlocks voice-based computing for people who have trouble using their voices.
-
June 11, 2021: Proteno model dramatically increases the efficiency of the first step in text-to-speech conversion.
-
June 10, 2021: Recasting different natural-language tasks in the same form dramatically improves few-shot multitask learning.
-
June 04, 2021: Topics range from the predictable, such as speech recognition and noise cancellation, to singing separation and automatic video dubbing.
-
June 03, 2021: Rastrow discussed the continued challenges and expanded role of speech recognition, and some of the interesting research and themes that emerged from ICASSP 2021.