Search - Amazon Science

Flexible and scalable state tracking framework for goal-oriented dialogue systems

Rahul Goel, Shachi Paul, Tagyoung Chung, Jeremie Lecomte, Arindam Mandal, Dilek Hakkani-Tür

NeurIPS 2018

2018

Goal-oriented dialogue systems typically rely on components specifically developed for a single task or domain. This limits such systems in two different ways: If there is an update in the task domain, the dialogue system usually needs to be updated or completely re-trained. It is also harder to extend such dialogue systems to different and multiple domains. The dialogue state tracker in conventional dialogue

Conversational AI

Statistical Model Compression for Small-Footprint Natural Language Understanding

Grant Strimel, Kanthashree Mysore Sathyendra, Stanislav Peshterliev

Interspeech 2018

2018

In this paper we investigate statistical model compression applied to natural language understanding (NLU) models. Small-footprint NLU models are important for enabling offline systems on hardware restricted devices, and for decreasing on demand model loading latency in cloud-based systems. To compress NLU models, we present two main techniques, parameter quantization and perfect feature hashing. These

Conversational AI

What we need to learn if we want to do and not just talk

Rashmi Gangadharaiah, Balakrishnan (Murali) Narayanaswamy

NAACL 2018

2018

In task-oriented dialog, agents need to generate both fluent natural language responses and correct external actions like database queries and updates. We show that methods that achieve state-of-the-art performance on synthetic datasets perform poorly in real-world dialog tasks. We propose a hybrid model, where nearest neighbor is used to generate fluent responses and Sequence-to-Sequence (Seq2Seq) type

Conversational AI

Selecting Machine-translated Data for Quick Bootstrapping of a Natural Language Understanding System

Judith Gaspers, Penny Karanasou, Rajen Chatterjee

NAACL 2018

2018

This paper investigates the use of Machine Translation (MT) to bootstrap a Natural Language Understanding (NLU) system for a new language for the use case of a large-scale voice-controlled device. The goal is to decrease the cost and time needed to get an annotated corpus for the new language, while still having a large enough coverage of user requests. Different methods of filtering MT data in order to

Conversational AI

Neural network based time-frequency masking and steering vector estimation for two-channel MVDR beamforming

Trausti Kristjansson

ICASSP 2018

2018

We present a neural network based approach to two-channel beamforming. First, single- and cross-channel spectral features are extracted to form a feature map for each utterance. A large neural network that is the concatenation of a convolutional neural network (CNN), long short-term memory recurrent neural network (LSTMRNN) and deep neural network (DNN) is then employed to estimate frame-level speech and

Conversational AI

Phrase Break Prediction for Long-form Reading TTS: Exploiting Text Structure Information

Viacheslav Klimkov, Adam Nadolski, Alexis Moinet, Bartosz Putrycz, Roberto Barra-Chicote, Tom Merritt, Thomas Drugman

Interspeech 2017

2018

Phrasing structure is one of the most important factors in increasing the naturalness of text-to-speech (TTS) systems, in particular for long-form reading. Most existing TTS systems are optimized for isolated short sentences, and completely discard the larger context or structure of the text. This paper presents how we have built phrasing models based on data extracted from audiobooks. We investigate how

Conversational AI

Multichannel audio front-end for far-field automatic speech recognition

Amit S. Chhetri, Philip Hilmes, Trausti Kristjansson, Robert Ayrapetian, Wai Chu, Mohamed Mansour, Xiaoxue Li, Xianxian Zhang

EUSIPCO 2018

2018

Far-field automatic speech recognition (ASR) is a key enabling technology that allows untethered and natural voice interaction between users and the Amazon Echo family of products. A key component in realizing far-field ASR on these products is the suite of audio front-end (AFE) algorithms that helps in mitigating acoustic environmental challenges and thereby improving the ASR performance. In this paper, we

Conversational AI

A scalable algorithm for higher-order features generation using MinHash

Pooja A, Naveen Nair, Rajeev Rastogi

CIKM 2018

2018

Linear models have been widely used in the industry for their low computation time, small memory footprint and interpretability. However, linear models are not capable of leveraging non-linear feature interactions in predicting the target. This limits their performance. A classical approach to overcome this limitation is to use combinations of the original features, referred to as higher-order features,

Machine learning

Combining Acoustic Embeddings and Decoding Features for End-of-Utterance Detection in Real-Time Far-Field Speech Recognition Systems

Roland Maas, Ariya Rastrow, Chengyuan Ma, Kyle Goehner, Gautam Tiwari, Shaun Joseph, Björn Hoffmeister

ICASSP 2018

2018

We present an end-of-utterance detector for real-time automatic speech recognition in far-field scenarios. The proposed system consists of three components: a long short-term memory (LSTM) neural network trained on acoustic features, an LSTM trained on 1-best recognition hypotheses of the automatic speech recognition (ASR) decoder, and a feedforward deep neural network (DNN) combining embeddings derived

Conversational AI

Deep factors with Gaussian processes for forecasting

Danielle Maddix Robinson, Yuyang (Bernie) Wang, Alex Smola

NeurIPS 2018

2018

A large collection of time series poses significant challenges for classical and neural forecasting approaches. Classical time series models fail to fit data well and to scale to large problems, but succeed at providing uncertainty estimates. The converse is true for deep neural networks. In this paper, we propose a hybrid model that incorporates the benefits of both approaches. Our new method is data-driven

Machine learning

A simple transfer-learning extension of Hyperband

Lazar Valkov, Rodolphe Jenatton, Fela Winkelmolen, Cédric Archambeau

NeurIPS 2018

2018

Hyperband has become a popular method to tune the hyperparameters (HPs) of expensive machine learning models, whose performance depends on the amount of resources allocated for training. While Hyperband is conceptually simple, combining random search with a successive halving technique to reallocate resources to the most promising HPs, it often outperforms standard Bayesian optimization when solutions with

Machine learning
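
The abstract above describes Hyperband as random search combined with successive halving. As a rough illustration only, the following is a minimal sketch of a single successive-halving round structure, not the paper's transfer-learning extension and not full Hyperband (which runs several such brackets with different trade-offs). The names `successive_halving`, `toy_loss`, `min_budget`, and `eta` are assumptions made for this example, and the objective is a made-up toy function.

```python
import random


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configurations each round and
    multiply the per-configuration budget by eta (lower score is better)."""
    survivors = list(configs)
    if not survivors:
        raise ValueError("need at least one configuration")
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving configuration at the current budget.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the top fraction, but never fewer than one configuration.
        survivors = scored[: max(1, len(scored) // eta)]
        # Reallocate: the remaining configurations get more resources.
        budget *= eta
    return survivors[0], budget


def toy_loss(lr, budget):
    # Hypothetical objective: improves with budget, best near lr = 0.1.
    return (lr - 0.1) ** 2 + 1.0 / budget


# Random-search step: sample candidate hyperparameters (here, learning rates).
rng = random.Random(0)
candidates = [rng.uniform(0.0, 1.0) for _ in range(8)]

best_lr, final_budget = successive_halving(candidates, toy_loss)
```

With eight candidates and `eta=2`, the sketch evaluates 8, then 4, then 2 configurations while the budget doubles each round, so most of the resources go to the candidates that looked best early on.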

Model checking boot code from AWS data centers

Byron Cook, Kareem Khazem, Daniel Kroening, Serdar Tasiran, Michael Tautschnig, Mark R. Tuttle

CAV 2018

2018

This paper describes our experience with symbolic model checking in an industrial setting. We have proved that the initial boot code running in data centers at Amazon Web Services is memory safe, an essential step in establishing the security of any data center. Standard static analysis tools cannot be easily used on boot code without modification owing to issues not commonly found in higher-level code,

Security, privacy, and abuse prevention

ProxQuant: Quantized neural networks via proximal operators

Yu Bai, Yu-Xiang Wang, Edo Liberty

ICLR 2018

2018

Deep neural networks are often desired in environments with limited memory and computational power (such as mobile devices), where it is beneficial to perform model quantization – training networks with low-precision weights. A key mechanism commonly used in training quantized nets is the straight-through gradient method, which enables back-propagation through the quantization mapping. Despite its success

Machine learning

Device-directed Utterance Detection

Sri Harish Mallidi, Roland Maas, Spyros Matsoukas, Björn Hoffmeister

Interspeech 2018

2018

In this work, we propose a classifier for distinguishing device-directed queries from background speech in the context of interactions with voice assistants. Applications include rejection of false wake-ups or unintended interactions as well as enabling wake-word free followup queries. Consider the example interaction: “Computer, play music”, “Computer, reduce the volume”. In this interaction, the user

Conversational AI

A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding

Young-Bum Kim, Dongchan Kim, Joo-Kyung Kim, Ruhi Sarikaya

NAACL 2018

2018

Intelligent personal digital assistants (IPDAs), a popular real-life application with spoken language understanding capabilities, can cover potentially thousands of overlapping domains for natural language understanding, and the task of finding the best domain to handle an utterance becomes a challenging problem on a large scale.

Conversational AI

MLZero: Towards zero touch machine learning

Tom Diethe, Tom Borchert, Eno Thereska, Borja de Balle Pigem, Cédric Archambeau, Neil Lawrence

NeurIPS 2018

2018

This paper describes a reference architecture for self-maintaining systems that can learn continually, as data arrives. In environments where data evolves, we need architectures that manage Machine Learning (ML) models in production, adapt to shifting data distributions, cope with outliers, retrain when necessary, and adapt to new tasks. This represents continual AutoML or Automatically Adaptive Machine

Cloud and systems

Unsupervised quality estimation without reference corpus for subtitle machine translation using word embeddings

Prabhakar Gupta, Shaktisingh Shekhawat, Keshav Kumar

ICSC 2018

2018

We demonstrate the potential for using aligned bilingual word embeddings to create an unsupervised method to evaluate machine translations without a need for a parallel translation corpus or reference corpus. We explain why movie subtitles differ from other text and share our experimental results conducted on them for four target languages (French, German, Portuguese and Spanish) with English-source subtitles

Conversational AI

The Fact Extraction and VERification (FEVER) Shared Task

James Thorne, Andreas Vlachos, Oana Cocarascu, Christos Christodoulopoulos, Arpit Mittal

EMNLP 2018

2018

We present the results of the first Fact Extraction and VERification (FEVER) Shared Task. The task challenged participants to classify whether human-written factoid claims could be SUPPORTED or REFUTED using evidence retrieved from Wikipedia. We received entries from 23 competing teams, 19 of which scored higher than the previously published baseline. The best performing system achieved a FEVER score of

Conversational AI

Supervised Domain Enablement Attention for Personalized Domain Classification

Joo-Kyung Kim, Young-Bum Kim

EMNLP 2018

2018

In large-scale domain classification for natural language understanding, leveraging each user’s domain enablement information, which refers to the domains preferred or authenticated by the user, with an attention mechanism has been shown to improve the overall domain classification performance. In this paper, we propose a supervised enablement attention mechanism, which utilizes sigmoid activation for the

Conversational AI

Neural Machine Translation For Paraphrase Generation

Alex Sokolov, Denis Filimonov

NeurIPS 2018

2018

Training a spoken language understanding system, such as the one in Alexa, typically requires a large human-annotated corpus of data. Manual annotations are expensive and time consuming. In the Alexa Skills Kit (ASK), the user experience with a skill greatly depends on the amount of data provided by the skill developer. In this work, we present an automatic natural language generation system, capable of generating both

Conversational AI
