-
ICASSP 2022 (2022). Household speaker identification with few enrollment utterances is an important yet challenging problem, especially when household members share similar voice characteristics and room acoustics. A common embedding space learned from a large number of speakers is not universally applicable for the optimal identification of every speaker in a household. In this work, we first formulate household speaker identification
-
ICASSP 2022 (2022). In this work, we aim to enhance the system robustness of end-to-end automatic speech recognition (ASR) against adversarially-noisy speech examples. We focus on a rigorous and empirical “closed model adversarial robustness” setting (e.g., on-device or cloud applications). The adversarial noise is only generated by closed-model optimization (e.g., evolutionary and zeroth-order estimation) without accessing
-
The Web Conference 2022 (2022). Automatic extraction of product attributes from their textual descriptions is essential for online shopper experience. One inherent challenge of this task is the emerging nature of e-commerce products: we constantly see new types of products with their own unique sets of new attributes. Most prior works on this matter mine new values for a set of known attributes but cannot handle new attributes that arose
-
ICASSP 2022 (2022). Automatic speech recognition (ASR) is increasingly being used in specialized domains such as medical ASR and news transcription. Owing to the lack of high quality annotated speech data in such domains, off-the-shelf models are commonly employed by fine-tuning on domain-specific data. This poses a significant challenge in transcribing long-tail expressions and out-of-vocabulary (OOV) named entities. On the
-
AAAI 2022 DE-FACTIFY Workshop: Multi-Modal Fake News and Hate-Speech Detection (2022). Over the years, memes have become very popular as social media services have grown rapidly. Understanding meme images as humans do is very complicated because of their multi-modal nature (text on images). In this paper, we describe our approach to classifying the sentiment and emotion of memes for the Memotion 2.0 challenge. Assuming correlation between the three sub-tasks, we implemented and compared four different multi-task
Related content
-
December 23, 2022: Program focuses on diversifying tech-industry talent.
-
December 22, 2022: A system built on Amazon Translate reduces the workload of human translators.
-
December 20, 2022: Ariadna Sanchez, a scientist who works in polyglot text to speech, draws on her musical background to help find novel solutions.
-
December 19, 2022: Transfer learning using limited contrastive data improves formality accuracy without compromising performance.
-
December 14, 2022: EMNLP papers examine constrained generation of rewrite candidates and automatic selection of information-rich training data.
-
December 13, 2022: Learn what goes into Amazon's effort to develop human-like reasoning for Alexa.