- ICASSP 2023: Self-supervised speech representation learning (S3RL) is revolutionizing the way we leverage the ever-growing availability of data. While S3RL-related studies typically use large models, we employ lightweight networks to comply with the tight memory budgets of compute-constrained devices. We demonstrate the effectiveness of S3RL on a keyword-spotting (KS) problem by using transformers with 330k parameters and propose …
- ICASSP 2023: We present dual-attention neural biasing, an architecture designed to boost wake-word (WW) recognition and improve inference-time latency on speech recognition tasks. This architecture enables a dynamic switch among its runtime compute paths by exploiting WW spotting to select which branch of its attention networks to execute for an input audio frame. With this approach, we effectively improve WW spotting …
- ICASSP 2023: Tiny Signal-to-Interpretation (TinyS2I) was recently introduced as an ultra-low-footprint end-to-end spoken language understanding (SLU) model. This architecture is capable of running in ultra-resource-constrained environments, such as voice assistant devices, while at the same time reducing latency. In this work, we propose an extension to TinyS2I and train a multilingual system supporting several languages …
- AAAI 2023: A primary objective of news articles is to establish the factual record for an event, frequently achieved by conveying both the details of the specified event (i.e., the 5 Ws: who, what, where, when, and why) and how people reacted to it (i.e., reported statements). However, existing work on news summarization focuses almost exclusively on the event details. In this work, we propose the …
- ICASSP 2023: Audio events have a hierarchical structure in both time and frequency and can be grouped together to construct more abstract semantic audio classes. In this work, we develop a multiscale audio spectrogram Transformer (MAST) that employs hierarchical representation learning for efficient audio classification. Specifically, MAST employs one-dimensional (and two-dimensional) pooling operators along the time …
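The MAST abstract describes pooling along the time axis of a spectrogram to build progressively coarser, more abstract representations. The snippet below is a minimal sketch of that multiscale idea only, not MAST's actual implementation: the `pool_time` helper and the stride value are illustrative assumptions, and a toy array stands in for a real log-mel spectrogram.

```python
import numpy as np

def pool_time(spec: np.ndarray, stride: int = 2) -> np.ndarray:
    """Average-pool a (time, freq) spectrogram along the time axis."""
    t, f = spec.shape
    t_trim = t - (t % stride)  # drop trailing frames that don't fill a full window
    return spec[:t_trim].reshape(t_trim // stride, stride, f).mean(axis=1)

# Build a three-level time pyramid from a toy 8-frame, 4-bin "spectrogram".
spec = np.arange(32, dtype=float).reshape(8, 4)
pyramid = [spec]
for _ in range(2):
    pyramid.append(pool_time(pyramid[-1]))

print([p.shape for p in pyramid])  # [(8, 4), (4, 4), (2, 4)]
```

Each level halves the temporal resolution while keeping the frequency axis intact, which is the kind of hierarchy a multiscale Transformer can attend over at decreasing cost per level.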
Related content
- July 6, 2023: The program exposes students to computer science as they create their own Alexa skills.
- July 5, 2023: Amazon Research Award recipient Shrikanth Narayanan is on a mission to make human-AI conversational experiences inclusive.
- July 3, 2023: With little training data and no mapping of speech to phonemes, Amazon researchers used voice conversion to generate Irish-accented training data in Alexa's own voice.
- June 26, 2023: How phonetically blended results (PBR) help ensure customers find the content they were actually asking for.
- June 9, 2023: In a top-3% paper at ICASSP, Amazon researchers adapt graph-based label propagation to improve speech recognition on underrepresented pronunciations.
- June 7, 2023: Team earned $500,000 for its performance in a challenge focused on advancing next-generation virtual assistants that help humans complete real-world tasks by continuously learning.