Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Analyzing the support-level for tips extracted from product reviews

Miriam Farber, David Carmel, Lital Kuchy, Avihai Mejer

SIGIR 2022

2022

Useful tips extracted from product reviews help customers make more informed purchase decisions and use the product better, more easily, and more safely. In this work we argue that extracted tips should be examined based on the amount of support and opposition they receive from all product reviews. A classifier, developed for this purpose, determines the degree to which a tip is supported

Related: Model assesses the validity of tips offered in product reviews

Conversational AI
Large sequence representation learning via multi-stage latent transformers

Ionut Catalin Sandu, Daniel Voinea, Alin-Ionut Popa

COLING 2022

2022

We present LANTERN, a multi-stage transformer architecture for named-entity recognition (NER) designed to operate on indefinitely large text sequences (i.e. >> 512 elements). For a given image of a form with structured text, our method uses language and spatial features to predict the entity tags of each text element. It breaks the quadratic computational constraints of the attention mechanism by operating

Conversational AI
Entity anchored ICD coding

Jay DeYoung, Han-Chin Shing, Chris (Luyang) Kong, Christopher Winestock, Chaitanya Shivade

AMIA 2022

2022

Medical coding is a complex task, requiring assignment of a subset of over 72,000 ICD codes to a patient’s notes. Modern natural language processing approaches to these tasks have been challenged by the length of the input and size of the output space. We limit our model inputs to a small window around medical entities found in our documents. From those local contexts, we build contextualized representations

Conversational AI
Dialog acts for task-driven embodied agents

Spandana Gella, Aishwarya Padmakumar, Patrick Lange, Dilek Hakkani-Tür

SIGDIAL 2022

2022

Embodied agents need to be able to interact in natural language, understanding task descriptions and asking appropriate follow-up questions to obtain necessary information, to be effective at successfully accomplishing tasks for a wide range of users. In this work, we propose a set of dialog acts for modelling such dialogs and annotate the TEACh dataset that includes over 3,000 situated, task-oriented conversations

Computer vision
MMT4: Multi modality to text transfer transformer

Amir Tavanaei, Karim Bouyarmane, Iman Keivanloo, Ismail Tutar

KDD 2022

2022

Recent studies have demonstrated the ability of auto-regressive and seq-to-seq generative models to reach state-of-the-art performance on various Natural Language Understanding (NLU) and Natural Language Processing (NLP) tasks. They operate by framing all the tasks in a single formulation: text auto-completion or text-to-text encoding-decoding. These models can be trained on the products corpus in order

Conversational AI
Training Speech Synthesizers on Data from Multiple Speakers

Jakub Lachowicz

April 25, 2019

When a customer asks Alexa to play “Hey Jude”, and Alexa responds, “Playing 'Hey Jude' by the Beatles,” that response is generated by a text-to-speech (TTS) system, which converts textual inputs into synthetic-speech outputs...

Conversational AI
Using wake word acoustics to filter out background speech improves speech recognition by 15%

Xing Fan

April 22, 2019

One of the ways that we’re always trying to improve Alexa’s performance is by teaching her to ignore speech that isn’t intended for her. At this year’s International Conference on Acoustics, Speech, and Signal Processing, my colleagues and I will present a new technique for doing this, which could complement the techniques that Alexa already uses.

Conversational AI
Two new papers discuss how Alexa recognizes sounds

Ming Sun

April 18, 2019

Last year, Amazon announced the beta release of Alexa Guard, a new service that lets customers who are leaving the house instruct their Echo devices to listen for glass breaking or smoke and carbon monoxide alarms going off. At this year’s International Conference on Acoustics, Speech, and Signal Processing, our team is presenting several papers on sound detection. I wrote about one of them a few weeks ago, a new method for doing machine learning with unbalanced data sets.

Conversational AI
Signal processor improves Echo’s bass response, loudness, and speech recognition accuracy

Jun Yang

April 11, 2019

Multiband dynamics processing, which separately modifies volume in different frequency bands of an audio signal, is known to improve listeners’ audio experiences. But in the context of voice-controlled systems like the Amazon Echo family of products, it can also improve automatic speech recognition by making echo cancellation easier.

Conversational AI
Cross-lingual transfer learning for bootstrapping AI systems reduces new-language data requirements

Quynh Ngoc Thi Do, Judith Gaspers

April 8, 2019

Transfer learning is the technique of adapting a machine learning model trained on abundant data to a new context in which training data is sparse. On the Alexa team, we’ve explored transfer learning as a way to bootstrap new functions and to add new classification categories to existing machine learning systems.

Conversational AI
New speech recognition experiments demonstrate how machine learning can scale

Sree Hari Krishnan Parthasarathi

April 4, 2019

Customer interactions with Alexa are constantly growing more complex, and on the Alexa science team, we strive to stay ahead of the curve by continuously improving Alexa’s speech recognition system. Increasingly, keeping pace with Alexa’s expanding capabilities will require automating the learning process, through techniques such as semi-supervised learning, which leverages a small amount of annotated data to extract information from a much larger store of unannotated data.
