Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Local-to-global learning for iterative training of production SLU models on new features

Yulia Grishina, Daniil Sorokin

NAACL 2022

2022

In production SLU systems, new training data becomes available with time so that ML models need to be updated on a regular basis. Specifically, releasing new features adds new classes of data while the old data remains constant. However, retraining the full model each time from scratch is computationally expensive. To address this problem, we propose to consider production releases from the curriculum learning

Conversational AI
Improving distantly supervised document-level relation extraction through natural language inference

Clara Vania, Grace E. Lee, Andrea Pierleoni

NAACL 2022 Workshop on Deep Learning for Low-Resource NLP

2022

The distant supervision (DS) paradigm has been widely used for relation extraction (RE) to alleviate the need for expensive annotations. However, it suffers from noisy labels, which leads to worse performance than models trained on human-annotated data, even when trained using hundreds of times more data. We present a systematic study on the use of natural language inference (NLI) to improve distantly supervised

Conversational AI
Sports narrative enhancement with natural language generation

Henry Wang, Saman Sarraf, Arbi Tamrazian

MIT Sloan Sports Analytics Conference 2022

2022

Sports broadcasters are increasingly sharing statistical insights throughout the game to tell a richer story for the audience. Thanks to abundant data and advanced statistics, broadcasters can quickly tell stories and make comparisons between teams and players to keep viewers engaged. To keep up with the fast-paced nature of many games, broadcasters rely on template-generated narratives to speak about in-game

Conversational AI
Differentially private bias-term only fine-tuning of foundation models

Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

NeurIPS 2022 Workshop on Trustworthy and Socially Responsible Machine Learning (TSRML)

2022

We study the problem of differentially private (DP) fine-tuning of large pre-trained models, a recent privacy-preserving approach suitable for solving downstream tasks with sensitive data. Existing work has demonstrated that high accuracy is possible under strong privacy constraints, yet it requires significant computational overhead or modifications to the network architecture. We propose differentially private

Related: Differential privacy for deep learning at GPT scale

Security, privacy, and abuse prevention
Meta-learning via language model in-context tuning

Yanda Chen, Sheng Zha, George Karypis, He He, Ruiqi Zhong

ACL 2022

2022

The goal of meta-learning is to learn to adapt to a new task with only a few labeled examples. Inspired by the recent progress in large language models, we propose in-context tuning (ICT), which recasts task adaptation and prediction as a simple sequence prediction problem: to form the input sequence, we concatenate the task instruction, labeled in-context examples, and the target input to predict; to metatrain
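The input construction described in this abstract can be illustrated with a minimal sketch. The function name `build_ict_input`, the newline separator, and the "input label" layout of each example are assumptions made for illustration only; the abstract does not specify the exact format the authors use.

```python
from typing import List, Tuple


def build_ict_input(
    instruction: str,
    examples: List[Tuple[str, str]],
    target_input: str,
    sep: str = "\n",  # assumed separator; not taken from the paper
) -> str:
    """Concatenate the task instruction, labeled in-context examples,
    and the target input into a single input sequence."""
    parts = [instruction]
    for example_input, example_label in examples:
        # Each labeled example is rendered as "input label" (hypothetical layout).
        parts.append(f"{example_input} {example_label}")
    # The target input comes last; a model would predict its label.
    parts.append(target_input)
    return sep.join(parts)


prompt = build_ict_input(
    "Classify sentiment.",
    [("great movie", "positive"), ("dull plot", "negative")],
    "loved it",
)
```

Here `prompt` holds the instruction, the two labeled examples, and the unlabeled target on separate lines, ready to pass to a sequence model.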

Conversational AI

Training Speech Synthesizers on Data from Multiple Speakers

Jakub Lachowicz

April 25, 2019

When a customer asks Alexa to play “Hey Jude”, and Alexa responds, “Playing 'Hey Jude' by the Beatles,” that response is generated by a text-to-speech (TTS) system, which converts textual inputs into synthetic-speech outputs...

Conversational AI
Using wake word acoustics to filter out background speech improves speech recognition by 15%

Xing Fan

April 22, 2019

One of the ways that we’re always trying to improve Alexa’s performance is by teaching her to ignore speech that isn’t intended for her. At this year’s International Conference on Acoustics, Speech, and Signal Processing, my colleagues and I will present a new technique for doing this, which could complement the techniques that Alexa already uses.

Conversational AI
Two new papers discuss how Alexa recognizes sounds

Ming Sun

April 18, 2019

Last year, Amazon announced the beta release of Alexa Guard, a new service that lets customers who are leaving the house instruct their Echo devices to listen for glass breaking or smoke and carbon monoxide alarms going off. At this year’s International Conference on Acoustics, Speech, and Signal Processing, our team is presenting several papers on sound detection. I wrote about one of them a few weeks ago, a new method for doing machine learning with unbalanced data sets.

Conversational AI
Signal processor improves Echo’s bass response, loudness, and speech recognition accuracy

Jun Yang

April 11, 2019

Multiband dynamics processing, which separately modifies volume in different frequency bands of an audio signal, is known to improve listeners’ audio experiences. But in the context of voice-controlled systems like the Amazon Echo family of products, it can also improve automatic speech recognition by making echo cancellation easier.

Conversational AI
Cross-lingual transfer learning for bootstrapping AI systems reduces new-language data requirements

Quynh Ngoc Thi Do, Judith Gaspers

April 8, 2019

Transfer learning is the technique of adapting a machine learning model trained on abundant data to a new context in which training data is sparse. On the Alexa team, we’ve explored transfer learning as a way to bootstrap new functions and to add new classification categories to existing machine learning systems.

Conversational AI
New speech recognition experiments demonstrate how machine learning can scale

Sree Hari Krishnan Parthasarathi

April 4, 2019

Customer interactions with Alexa are constantly growing more complex, and on the Alexa science team, we strive to stay ahead of the curve by continuously improving Alexa’s speech recognition system. Increasingly, keeping pace with Alexa’s expanding capabilities will require automating the learning process, through techniques such as semi-supervised learning, which leverages a small amount of annotated data to extract information from a much larger store of unannotated data.
