- NAACL 2022: The massive number of trainable parameters in pre-trained language models (PLMs) makes them hard to deploy to multiple downstream tasks. To address this issue, parameter-efficient transfer learning methods have been proposed that tune only a few parameters during fine-tuning while freezing the rest (see the freezing-and-adapter sketch after this list). This paper looks at existing methods along this line through the kernel lens. Motivated by the connection …
- NAACL 2022: Providing conversation models with background knowledge has been shown to make open-domain dialogues more informative and engaging. Existing models treat knowledge selection as a sentence ranking or classification problem in which each sentence is handled individually, ignoring the internal semantic connections among sentences in the background document (a sketch of this sentence-by-sentence baseline follows this list). In this work, we propose to automatically convert the background …
- NAACL 2022: With the growing popularity of deep-learning models, model understanding becomes more important. Much effort has been devoted to demystifying deep neural networks for better interpretability. Some feature attribution methods have shown promising results in computer vision, especially the gradient-based methods, where effectively smoothing the gradients with reference data is key to a robust and faithful result … (a gradient-smoothing sketch follows this list.)
- NAACL 2022: Self-learning paradigms in large-scale conversational AI agents tend to leverage user feedback to bridge the gap between what users say and what they mean. However, such learning, particularly in Markov-based query rewriting systems, has fallen far short of addressing the impact of these models on future training, where successive feedback is inevitably contingent on the rewrite itself, especially in a continually updating …
- NAACL 2022: Unsupervised word alignments offer a lightweight and interpretable method to transfer labels from high- to low-resource languages, as long as semantically related words have the same label across languages. But this assumption is often not true in industrial NLP pipelines, where multilingual annotation guidelines are complex and deviate from semantic consistency due to various factors (such as annotation … (A label-projection sketch follows this list.)
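For the first abstract, here is a minimal sketch of the parameter-efficient idea it describes: freeze all pre-trained weights and train only a small set of new parameters. The `TinyEncoder` backbone and the adapter bottleneck are hypothetical stand-ins for illustration, not the paper's actual method.

```python
# Minimal parameter-efficient fine-tuning sketch: freeze a (stand-in)
# pre-trained backbone and train only a small adapter module.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a pre-trained language model backbone."""
    def __init__(self, dim=64):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
    def forward(self, x):
        return self.layers(x)

class Adapter(nn.Module):
    """Small bottleneck module: the only trainable parameters."""
    def __init__(self, dim=64, bottleneck=8):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual bottleneck

backbone, adapter = TinyEncoder(), Adapter()
for p in backbone.parameters():
    p.requires_grad = False  # freeze every pre-trained weight

optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-3)
x, y = torch.randn(4, 64), torch.randn(4, 64)
loss = nn.functional.mse_loss(adapter(backbone(x)), y)
loss.backward()   # gradients flow only into the adapter
optimizer.step()

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in backbone.parameters())
print(f"trainable params: {trainable}/{total}")
```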
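For the second abstract, a sketch of the sentence-ranking baseline it critiques: each background sentence is scored against the dialogue context in isolation, so no inter-sentence structure is used. The bag-of-words cosine scorer and the toy sentences are illustrative assumptions, not the proposed model.

```python
# Sentence-by-sentence knowledge selection: score each background
# sentence against the context independently, then rank.
from collections import Counter
import math

def bow_cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two strings."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

context = "who directed the movie inception"
background = [
    "Inception is a 2010 science fiction film.",
    "The film was directed by Christopher Nolan.",
    "It grossed over 800 million dollars worldwide.",
]
# Each sentence is handled individually; no sentence sees its neighbors.
ranked = sorted(background, key=lambda s: bow_cosine(context, s), reverse=True)
print(ranked[0])
```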
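For the third abstract, one common flavor of gradient smoothing for attribution is SmoothGrad-style averaging over perturbed copies of the input; variants that smooth "with reference data", as the abstract mentions, choose the perturbations differently, so this toy model and noise level are only an illustrative stand-in.

```python
# SmoothGrad-style attribution sketch: average input gradients over
# several noisy copies of the input to get a smoother attribution map.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
x = torch.randn(1, 10)

def smooth_grad(model, x, n_samples=20, sigma=0.1):
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        model(noisy).sum().backward()
        grads += noisy.grad  # gradient w.r.t. the perturbed input
    return grads / n_samples  # averaged gradient = smoothed attribution

print(smooth_grad(model, x))
```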
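For the fifth abstract, a sketch of label projection through word alignments, which also makes visible the assumption the abstract questions: that aligned words share a label. The hand-written alignment pairs and NER-style tags are hypothetical; a real pipeline would obtain the pairs from an unsupervised aligner.

```python
# Label projection via word alignments: copy each source token's label
# to its aligned target token.
src_tokens = ["Berlin", "is", "in", "Germany"]
src_labels = ["B-LOC", "O", "O", "B-LOC"]
tgt_tokens = ["Berlin", "liegt", "in", "Deutschland"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]  # hand-written (src_idx, tgt_idx)

tgt_labels = ["O"] * len(tgt_tokens)
for s, t in alignment:
    tgt_labels[t] = src_labels[s]  # assumes aligned words share a label

print(list(zip(tgt_tokens, tgt_labels)))
```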
Related content
- August 01, 2022: McKeown awarded IEEE Innovation in Societal Infrastructure Award and named a member of the American Philosophical Society.
- July 28, 2022: Donato Crisostomi talks about how his mother helped spark a love of knowledge that led him to two science internships at Amazon.
- July 22, 2022: New EMNLP workshop will feature talks, papers, posters, and a competition built around the 50-plus-language, million-utterance MASSIVE dataset.
- July 15, 2022: New method optimizes the twin demands of retrieving relevant content and filtering out bad content.
- July 14, 2022: To become the interface for the Internet of things, conversational agents will need to learn on their own. Alexa has already started down that path.
- July 13, 2022: Four MIT professors are the recipients of the inaugural call for research projects.