- Interspeech 2022: In this work, we define barge-in verification as a supervised learning task where audio-only information is used to classify user spoken dialogue into true and false barge-ins. Following the success of pre-trained models, we use low-level speech representations from a self-supervised representation learning model for our downstream classification task. Further, we propose a novel technique to infuse lexical…
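The abstract above does not specify the classifier, but the general recipe it names (frame-level representations from a self-supervised speech model, fed to a downstream binary classifier) can be sketched as follows. The pooling choice and the logistic head here are illustrative assumptions, not the paper's method:

```python
import numpy as np

def classify_barge_in(frame_features, w, b):
    """Classify an utterance as a true or false barge-in.

    frame_features: (T, D) array of frame-level speech representations,
                    e.g. from a frozen self-supervised encoder.
    w, b:           parameters of a simple logistic classification head.
    """
    # Mean-pool frames into a single utterance-level embedding.
    utterance = frame_features.mean(axis=0)            # shape (D,)
    # Logistic head: probability that this is a true barge-in.
    return 1.0 / (1.0 + np.exp(-(utterance @ w + b)))
```

With untrained (zero) parameters the head is maximally uncertain, returning 0.5 for any input; training would fit `w` and `b` on labeled true/false barge-in audio.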
- Interspeech 2022: We present a novel sub-8-bit quantization-aware training (S8BQAT) scheme for 8-bit neural network accelerators. Our method is inspired by Lloyd-Max compression theory, with practical adaptations for a feasible computational overhead during training. With the quantization centroids derived from a 32-bit baseline, we augment training loss with a Multi-Regional Absolute Cosine (MRACos) regularizer that aggregates…
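The centroid-derivation step the abstract mentions can be illustrated with the classical Lloyd-Max procedure, which in one dimension is Lloyd's algorithm (1-D k-means) over the full-precision weight values. This is a generic sketch of that textbook step, not the paper's S8BQAT training loss or the MRACos regularizer, whose details the abstract does not give:

```python
import numpy as np

def lloyd_max_centroids(weights, n_levels=16, n_iter=50):
    """Derive quantization centroids from full-precision weights
    via Lloyd's algorithm (1-D k-means)."""
    w = weights.ravel()
    # Initialize centroids uniformly over the observed weight range.
    centroids = np.linspace(w.min(), w.max(), n_levels)
    for _ in range(n_iter):
        # Assignment: map each weight to its nearest centroid.
        idx = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        # Update: each centroid moves to the mean of its cell.
        for k in range(n_levels):
            cell = w[idx == k]
            if cell.size:
                centroids[k] = cell.mean()
    return centroids

def quantize(weights, centroids):
    """Snap each weight to its nearest centroid."""
    idx = np.abs(weights[..., None] - centroids).argmin(axis=-1)
    return centroids[idx]
```

Because the centroids adapt to the weight distribution, this typically yields lower quantization error on non-uniform (e.g. bell-shaped) weight distributions than a uniform grid with the same number of levels.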
- KDD 2022: We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into smaller models ranging from 17M to 170M parameters, and their application to the Natural Language Understanding (NLU) component of a virtual assistant system. Though we train using 70% spoken-form data, our teacher models perform comparably…
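The abstract does not spell out the distillation objective, but the standard teacher-student recipe it refers to blends a temperature-softened KL term against the teacher with a hard-label cross-entropy term. A minimal sketch of that generic loss, with illustrative hyperparameters `T` and `alpha`:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend of a soft-target KL term and hard-label cross-entropy."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL(teacher || student) on temperature-softened distributions;
    # the T**2 factor keeps gradient magnitudes comparable across T.
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    # Standard cross-entropy against the ground-truth labels.
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12)
    return np.mean(alpha * T**2 * kl + (1 - alpha) * ce)
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the hard-label cross-entropy contributes, which is a quick sanity check for an implementation.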
- NAACL 2022 Workshop on Semantic Evaluation: We present the findings of SemEval-2022 Task 11 on Multilingual Complex Named Entity Recognition (MULTICONER). Divided into 13 tracks, the task focused on methods to identify complex named entities (like media titles, products, and groups) in 11 languages in both monolingual and multilingual scenarios. Eleven tracks were for building monolingual NER models for individual languages, one track focused on multilingual…
- Interspeech 2022: Conversational agents commonly utilize keyword spotting (KWS) to initiate voice interaction with the user. For user experience and privacy considerations, existing approaches to KWS largely focus on accuracy, which can often come at the expense of introduced latency. To address this tradeoff, we propose a novel approach to control KWS model latency that generalizes to any loss function without explicit…
Related content
- June 02, 2021: More-autonomous machine learning systems will make Alexa more self-aware, self-learning, and self-service.
- June 01, 2021: The event is over, but Amazon Science interviewed each of the six speakers within the Science of Machine Learning track. See what they had to say.
- May 26, 2021: Teams from three continents will compete to develop agents that assist customers in completing multi-step tasks.
- May 19, 2021: Calibrating noise addition to word density in the embedding space improves the utility of privacy-protected text.
- May 14, 2021: Principal scientist will be recognized at Interspeech 2021.
- May 12, 2021: Ström discusses his career journey in conversational AI, his published research, and where he sees the field of conversational AI headed next.