Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Towards reasoning-aware explainable VQA

Rakesh Vaideeswaran Mahesh, Feng Gao, Abhinav Mathur, Govind Thattai

NeurIPS 2022 Workshop on Trustworthy and Socially Responsible Machine Learning (TSRML)

2022

The domain of joint vision-language understanding, especially in the context of reasoning in Visual Question Answering (VQA) models, has garnered significant attention recently. While most existing VQA models focus on improving accuracy, the way they arrive at an answer is often a black box. As a step towards making the VQA task more explainable and interpretable, our

Computer vision
Low resource retrieval augmented adaptive neural machine translation

Harsha Vardhan, Anurag Beniwal, Narayanan Sadagopan, Swair Shah

NeurIPS 2022 Workshop on Trustworthy and Socially Responsible Machine Learning (TSRML)

2022

We propose KNN-Kmeans MT, a sample-efficient algorithm that improves retrieval-based augmentation performance in low-resource settings by adding a K-means filtering layer after the KNN step. KNN-Kmeans MT, like its predecessor retrieval-augmented machine translation approaches (Khandelwal et al. [2020]), doesn’t require any additional training and outperforms the existing methods in low resource

Conversational AI
Multimodal context carryover

Prashan Wanigasekara, Nalin Gupta, Fan Yang, Emre Barut, Zeynab Raeesy, Kechen Qin, Stephen Rawls, Xinyue Liu, Chengwei Su, Spurthi Sandiri

EMNLP 2022

2022

Multi-modality support has become an integral part of creating a seamless user experience on modern voice assistants with smart displays. Users refer to images, video thumbnails, or the accompanying text descriptions on the screen through voice communication with AI-powered devices. This raises the need to either augment existing commercial voice-only dialogue systems with state-of-the-art multimodal

Conversational AI
Pyramid dynamic inference: Encouraging faster inference via early exit boosting

Ershad Banijamali, Pegah Kharazmi, Sepehr Eghbali, Jixuan Wang, Clement Chung, Samridhi Choudhary

NeurIPS 2022 Workshop on Efficient Natural Language and Speech Processing (ENLSP), ICASSP 2023

2022

Transformer-based models demonstrate state-of-the-art results on several natural language understanding tasks. However, their deployment comes at the cost of increased footprint and inference latency, limiting their adoption in real-time applications. Early exit strategies are designed to speed up inference by routing out a subset of samples at the earlier layers of the model. Exiting early causes losing

Conversational AI
GEMv2: Multilingual NLG benchmarking in a single line of code

Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bernd Bohnet, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanch, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza Jolly, Simon Mille, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

EMNLP 2022

2022

Evaluations in machine learning rarely use the latest metrics, datasets, or human evaluation in favor of remaining compatible with prior work. The compatibility, often facilitated through leaderboards, thus leads to outdated but standardized evaluation practices. We pose that the standardization is taking place in the wrong spot. Evaluation infrastructure should enable researchers to use the latest methods

Conversational AI

How Amazon scientists are driving success for Alexa in the car

John Roach

April 12, 2023

From noisy cars to unreliable signals, researchers have worked to extend the Alexa experience to vehicles on the move.

Conversational AI
Five finalists selected for inaugural Alexa Prize SimBot Challenge

Alexa Prize team

April 6, 2023

University teams are competing to help advance the science of conversational embodied AI and robust human-AI interaction.

Conversational AI
How Amazon Chime SDK’s voice tone analysis works

Masahito Togami, Mike Goodwin

April 3, 2023

Combining acoustic and lexical information improves real-time voice sentiment analysis.

Conversational AI
Amazon, MIT research symposium focused on cutting-edge technology

Staff writer

March 31, 2023

Attendees explored new avenues of research in areas including robotics and conversational AI via roundtables moderated by researchers from Amazon.

Conversational AI
Amazon and IIT Bombay launch multiyear collaboration

Staff writer

March 27, 2023

Initiative will advance artificial intelligence and machine learning research within speech, language, and multimodal-AI domains.

Conversational AI
Neural encoding enables more-efficient recovery of lost audio packets

Jean-Marc Valin, Mike Goodwin

March 24, 2023

By leveraging neural vocoding, Amazon Chime SDK’s new deep-redundancy (DRED) technology can reconstruct long sequences of lost packets with little bandwidth overhead.

Conversational AI
