Publications

Amazon is a great place to practice science and have real business impact, but that's only one part of the story. Our scientists continue to publish, teach, and engage with the worldwide research community, sharing insights across diverse disciplines from machine learning to operations research. Through these contributions, we're advancing scientific knowledge while developing innovations that address complex challenges for customers and society.



Short Text Classification Using Graph Convolutional Network

Kshitij Tayal, Nikhil Rao, Saurabh Agrawal, Karthik Subbian

NeurIPS 2019 Workshop on Graph Representation Learning

2019

Short text classification is a fundamental problem in natural language processing, social network analysis, and e-commerce. Traditional approaches for classifying text do not generalize to short texts, due to the lack of structure that is prevalent in longer sentences and paragraphs. More recently, deep learning-based methods have been applied to this problem, with limited success. To overcome the limitations

Conversational AI
Controlling the Output Length of Neural Machine Translation

Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico

IWSLT 2019 International Workshop on Spoken Language Translation

2019

The recent advances introduced by neural machine translation (NMT) are rapidly expanding the application fields of machine translation, as well as reshaping the quality level to be targeted. In particular, if translations have to fit some given layout, quality should not only be measured in terms of adequacy and fluency, but also length. Exemplary cases are the translation of document files, subtitles, and

Conversational AI
Robust Neural Machine Translation for Clean and Noisy Speech Transcripts

Mattia Di Gangi, Robert Enyedi, Alessandra Brusadin, Marcello Federico

IWSLT 2019 International Workshop on Spoken Language Translation

2019

Neural machine translation models have shown to achieve high quality when trained and fed with well structured and punctuated input texts. Unfortunately, the latter condition is not met in spoken language translation, where the input is generated by an automatic speech recognition (ASR) system. In this paper, we study how to adapt a strong NMT system to make it robust to typical ASR errors. As in our application

Conversational AI
Robustness to capitalization errors in named entity recognition

Yun Hyokun, Sravan Bodapati, Yaser Al-Onaizan

EMNLP 2019 Workshop on Noisy User-Generated Text

2019

Robustness to capitalization errors is a highly desirable characteristic of named entity recognizers, yet we find standard models for the task are surprisingly brittle to such noise. Existing methods to improve robustness to the noise completely discard given orthographic information, which significantly degrades their performance on well-formed text. We propose a simple alternative approach based on data

Conversational AI
Passage ranking with weak supervision

Peng Xu, Xiaofei Ma, Ramesh Nallapati, Bing Xiang

ICLR 2019 Second Workshop on Learning from Limited Labeled Data

2019

In this paper, we propose a weak supervision framework for neural ranking tasks based on the data programming paradigm (Ratner et al., 2016), which enables us to leverage multiple weak supervision signals from different sources. Empirically, we consider two sources of weak supervision signals, unsupervised ranking functions and semantic feature similarities. We train a BERT-based passage-ranking model (which

Information and knowledge management
Full page offline handwriting text recognition

Jonathan Chung, Thomas Delteil

ICDAR 2019 Workshop on Machine Learning

2019

Offline handwriting recognition with deep neural networks is usually limited to words or lines due to large computational costs. In this paper, a less computationally expensive full page offline handwritten text recognition framework is introduced. This framework includes a pipeline that locates handwritten text with an object detection neural network and recognises the text within the detected regions

Machine learning
Span-level model for relation extraction

Kalpit Dixit, Yaser Al-Onaizan

ACL 2019

2019

Relation Extraction is the task of identifying entity mention spans in raw text and then identifying relations between pairs of the entity mentions. Recent approaches for this span-level task have been token-level models which have inherent limitations. They cannot easily define and implement span-level features, cannot model overlapping entity mentions and have cascading errors due to the use of sequential

Conversational AI
P3O: policy-on policy-off policy optimization

Rasool Fakoor, Pratik Chaudhari, Alex Smola

UAI 2019

2019

On-policy reinforcement learning (RL) algorithms have high sample complexity while off-policy algorithms are difficult to tune. Merging the two holds the promise to develop efficient algorithms that generalize across diverse environments. It is however challenging in practice to find suitable hyper-parameters that govern this trade off. This paper develops a simple algorithm named P3O that interleaves off-policy

Machine learning
On acoustic modeling for broadband beamforming

Amit S. Chhetri, Mohamed Mansour, Wontak Kim, Guangdong Pan

EUSIPCO 2019

2019

In this work, we describe limitations of the free-field propagation model for designing broadband beamformers for microphone arrays on a rigid surface. Towards this goal, we describe a general framework for quantifying the microphone array performance in a general wave-field by directly solving the acoustic wave equation. The model utilizes Finite-Element-Method (FEM) for evaluating the response of the

Conversational AI
Topic modeling with Wasserstein autoencoders

Feng Nan, Ran Ding, Ramesh Nallapati, Bing Xiang

ACL 2019

2019

We propose a novel neural topic model in the Wasserstein autoencoders (WAE) framework. Unlike existing variational autoencoder based models, we directly enforce Dirichlet prior on the latent document-topic vectors. We exploit the structure of the latent space and apply a suitable kernel in minimizing the Maximum Mean Discrepancy (MMD) to perform distribution matching. We discover that MMD performs much

Conversational AI
FACSIMILE: Fast and accurate scans from an image in less than a second

David Smith, Matthew Loper, Sonny Hu, Paris Mavroidis, Javier Romero

ICCV 2019

2019

Current methods for body shape estimation either lack detail or require many images. They are usually architecturally complex and computationally expensive. We propose FACSIMILE (FAX), a method that estimates a detailed body from a single photo, lowering the bar for creating virtual representations of humans. Our approach is easy to implement and fast to execute, making it easily deployable. FAX uses an

Computer vision
Efficient learning on point clouds with basis point sets

Sergey Prokudin, Christoph Lassner, Javier Romero

ICCV 2019

2019

With an increased availability of 3D scanning technology, point clouds are moving into the focus of computer vision as a rich representation of everyday scenes. However, they are hard to handle for machine learning algorithms due to their unordered structure. One common approach is to apply occupancy grid mapping, which dramatically increases the amount of data stored and at the same time loses details

Computer vision
Towards Universal Dialogue Act Tagging for Task-Oriented Dialogues

Shachi Paul, Rahul Goel, Dilek Hakkani-Tür

Interspeech 2019

2019

Machine learning approaches for building task-oriented dialogue systems require large conversational datasets with labels to train on. We are interested in building task-oriented dialogue systems from human-human conversations, which may be available in ample amounts in existing customer care center logs or can be collected from crowd workers. Annotating these datasets can be prohibitively expensive. Recently


Conversational AI
Finding the action: Spatial-temporal discriminative filter banks for action recognition

Brais Martinez Alonso, Davide Modolo, Yuanjun Xiong, Joe Tighe

ICCV 2019

2019

Action recognition has seen a dramatic performance improvement in the last few years. Most of the current state-of-the-art literature either aims at improving performance through changes to the backbone CNN network, or they explore different trade-offs between computational efficiency and performance, again through altering the backbone network. However, almost all of these works maintain the same last

Computer vision
Improving ASR confidence scores for Alexa using acoustic and hypothesis embeddings

Prakhar Swarup, Roland Maas, Sri Garimella, Sri Harish Mallidi, Björn Hoffmeister

Interspeech 2019

2019

In automatic speech recognition, confidence measures provide a quantitative representation used to assess the reliability of generated hypothesis text. For personal assistant devices like Alexa, speech recognition errors are inevitable due to the growing number of applications. Hence, confidence scores provide an important metric to downstream consumers to gauge the correctness of ASR hypothesis text and

Conversational AI
