Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

OpenFEAT: Improving speaker identification by open-set few-shot embedding adaptation with Transformer

Kishan K C, Zhenning Tan, Long Chen, Minho Jin, Eunjung Han, Andreas Stolcke, Chul Lee

ICASSP 2022

2022

Household speaker identification with few enrollment utterances is an important yet challenging problem, especially when household members share similar voice characteristics and room acoustics. A common embedding space learned from a large number of speakers is not universally applicable for the optimal identification of every speaker in a household. In this work, we first formulate household speaker identification

Conversational AI
Mitigating closed-model adversarial examples with Bayesian neural modeling for enhanced end-to-end speech recognition

Huck Yang, Zeeshan Ahmed, Yi Gu, Joseph Szurley, Roger Ren, Linda Liu, Andreas Stolcke, Ivan Bulyko

ICASSP 2022

2022

In this work, we aim to enhance the system robustness of end-to-end automatic speech recognition (ASR) against adversarially-noisy speech examples. We focus on a rigorous and empirical “closed model adversarial robustness” setting (e.g., on-device or cloud applications). The adversarial noise is only generated by closed-model optimization (e.g., evolutionary and zeroth-order estimation) without accessing

Security, privacy, and abuse prevention
OA-Mine: Open-world attribute mining for e-commerce products with weak supervision

Xinyang Zhang, Chenwei Zhang, Xian Li, Xin Luna Dong, Jingbo Shang, Christos Faloutsos, Jiawei Han

The Web Conference 2022

2022

Automatic extraction of product attributes from their textual descriptions is essential for online shopper experience. One inherent challenge of this task is the emerging nature of e-commerce products — we see new types of products with their unique set of new attributes constantly. Most prior works on this matter mine new values for a set of known attributes but cannot handle new attributes that arose

Information and knowledge management
Listen, know and spell: Knowledge-infused subword modeling for improving ASR performance of out-of-vocabulary (OOV) named entities

Nilaksh Das, Monica Sunkara, Dhanush Bekal, Duen Horng Chau, Sravan Bodapati, Katrin Kirchhoff

ICASSP 2022

2022

Automatic speech recognition (ASR) is increasingly being used in specialized domains such as medical ASR and news transcription. Owing to the lack of high quality annotated speech data in such domains, off-the-shelf models are commonly employed by fine-tuning on domain-specific data. This poses a significant challenge in transcribing long-tail expressions and out-of-vocabulary (OOV) named entities. On the

Conversational AI
Multi-modal, multi-task learning for Memotion 2.0 challenge

Gwang Gook Lee, Mingwei Shen

AAAI 2022 DE-FACTIFY Workshop: Multi-Modal Fake News and Hate-Speech Detection

2022

Over the years, memes have become very popular as social media services have grown rapidly. Understanding meme images as humans do is very complicated because of their multi-modal nature (text on images). In this paper, we describe our approach for classifying sentiment and emotion of memes for the Memotion 2.0 challenge. Assuming correlation between three sub-tasks, we implemented and compared four different multi-task

Conversational AI

AmazonNext program hosts final project presentations at Virginia HQ2

Staff writer

December 23, 2022

Program focuses on diversifying tech-industry talent.

Conversational AI
Auto-translating "Dive into Deep Learning" with Amazon Translate

Yunfei Bai, Rachel Hu, Anna Currey

December 22, 2022

A system built on Amazon Translate reduces the workload of human translators.

Conversational AI
How a lifelong music student uses melody and lyrics in TTS research

Staff writer

December 20, 2022

Ariadna Sanchez, a scientist who works in polyglot text to speech, draws on her musical background to help find novel solutions.

Conversational AI
Controlling formality in machine translation

Maria Nădejde, Benjamin Hsu

December 19, 2022

Transfer learning using limited contrastive data improves formality accuracy without compromising performance.

Conversational AI
Data-efficient continual learning in Alexa

Pradeep Natarajan

December 14, 2022

EMNLP papers examine constrained generation of rewrite candidates and automatic selection of information-rich training data.

Conversational AI
reMARS revisited: Human-like reasoning for an AI

Staff writer

December 13, 2022

Learn what goes into Amazon's effort to develop human-like reasoning for Alexa.

Conversational AI
