- ICASSP 2022: Second-pass rescoring is an important component in automatic speech recognition (ASR) systems, used to improve the outputs of a first-pass decoder via lattice rescoring or n-best re-ranking. While pretraining with a masked language model (MLM) objective has achieved great success on various natural language understanding (NLU) tasks, it has not gained traction as a rescoring model… (a minimal MLM rescoring sketch follows this list)
- The Web Conference 2022: Explainable recommendation seeks to provide not only high-quality recommendations but also intuitive explanations. Our objective is not to generate accurate recommendations per se, but to produce user-friendly explanations through recommendation captions. Importantly, existing work has focused predominantly on explaining a single item recommendation. In e-commerce websites, product recommendations…
- ICASSP 2022: Maximum Likelihood Estimation (MLE) is currently the most common approach to training large-scale speech recognition systems. While it has significant practical advantages, MLE exhibits several drawbacks known in the literature: training and inference conditions are mismatched, and a proxy objective is optimized instead of word error rate (see the objective-vs-metric note after this list). Recently, the Optimal Completion Distillation (OCD) training method was…
- ICASSP 2022: Training speaker-discriminative and robust speaker verification systems without speaker labels remains challenging and worth exploring. In this study, we propose an effective self-supervised learning framework and a novel regularization strategy to facilitate self-supervised speaker representation learning. Unlike contrastive learning-based self-supervised learning methods (a reference sketch of such a loss follows this list), the proposed self-supervised…
- AAAI 2022: Robots operating in human spaces must be able to engage in natural language interaction, both understanding and executing instructions, and using conversation to resolve ambiguity and correct mistakes. To study this, we introduce TEACh, a dataset of over 3,000 human–human interactive dialogues to complete household tasks in simulation. A Commander with access to oracle information about a task communicates…
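
The first ICASSP 2022 item above concerns rescoring n-best ASR hypotheses with a masked language model. Purely as a rough illustration of the general idea (not the paper's method), the sketch below scores each hypothesis with a BERT pseudo-log-likelihood and re-ranks it against the first-pass score; the checkpoint name, the helper functions, and the interpolation weight lam are assumptions made for this example.

```python
# Hypothetical sketch of MLM-based n-best rescoring via pseudo-log-likelihood (PLL);
# the model choice and interpolation weight are illustrative, not from the paper.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def pseudo_log_likelihood(sentence: str) -> float:
    """Sum the log-prob of each token when that token alone is masked."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, ids.size(0) - 1):            # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

def rescore(nbest, lam=0.5):
    """Re-rank (hypothesis, first_pass_score) pairs by an interpolated score."""
    return sorted(nbest,
                  key=lambda h: h[1] + lam * pseudo_log_likelihood(h[0]),
                  reverse=True)

print(rescore([("i scream for ice cream", -4.2),
               ("eye scream four ice cream", -4.0)]))
```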
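
The second ICASSP 2022 item notes that MLE optimizes a proxy objective under conditions that differ from inference. A generic way to state that mismatch (my notation, not the paper's): the teacher-forced MLE loss conditions on ground-truth prefixes, while the system is judged by word error rate on its own decodes.

```latex
% Teacher-forced MLE objective (a proxy) vs. the evaluation metric (WER).
% Generic notation, not taken from the paper.
\mathcal{L}_{\mathrm{MLE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\!\left(y_t \mid y_{<t},\, x\right)
\qquad\text{vs.}\qquad
\mathrm{WER}(\hat{y}, y) = \frac{S + D + I}{N}
```

Here S, D, and I count substituted, deleted, and inserted words in the hypothesis against a reference of N words; at inference the model conditions on its own predictions rather than the ground-truth prefix, which is the train/inference mismatch the abstract refers to.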
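
The third ICASSP 2022 item positions its framework against contrastive self-supervised learning. Purely as a reference point for what is being contrasted with (not the paper's proposed framework), here is a minimal NT-Xent-style contrastive loss over speaker embeddings of two augmented views of each utterance; batch size, embedding dimension, and temperature are arbitrary assumptions.

```python
# Reference sketch of a contrastive (NT-Xent) loss over speaker embeddings.
# This illustrates the contrastive baseline mentioned in the abstract, not the
# paper's proposed self-supervised framework or regularization strategy.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two augmented views of each utterance."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                # (2B, dim)
    sim = z @ z.t() / temperature                 # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))             # exclude self-similarity
    B = z1.size(0)
    # The positive for row i is the other view of the same utterance.
    targets = torch.cat([torch.arange(B, 2 * B), torch.arange(0, B)])
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(8, 192), torch.randn(8, 192))
print(loss.item())
```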
Related content
- October 3, 2023: Team TWIZ from NOVA School of Science and Technology awarded $500,000 prize for first-place overall performance.
- September 20, 2023: Leveraging large language models will make interactions with Alexa more natural and engaging.
- September 12, 2023: GauchoChat wins $250,000 first place prize in overall competition; Chirpy Cardinal earns $250,000 for first place in scientific innovation category.
- August 28, 2023: AWS service enables machine learning innovation on a robust foundation.
- August 23, 2023: Senior principal scientist Jasha Droppo on the shared architectures of large language models and spectrum quantization text-to-speech models, and other convergences between the two fields.
- August 18, 2023: Speech recognition predominates, but Amazon's research takes in data representation, dialogue management, question answering, and more.