Publications

Amazon is a great place to practice science and have real business impact, but that's only one part of the story. Our scientists continue to publish, teach, and engage with the worldwide research community, sharing insights across diverse disciplines from machine learning to operations research. Through these contributions, we're advancing scientific knowledge while developing innovations that address complex challenges for customers and society.



Semantic aligned multi-modal transformer for vision-language understanding: A preliminary study on visual QA

Han Ding, Erran Li, Zhiting Hu, Yi Xu, Dilek Hakkani-Tür, Zheng Du, Belinda Zeng

NAACL 2021 Workshop on Multimodal Artificial Intelligence

2021

Recent vision-language understanding approaches adopt a multi-modal transformer pretraining and finetuning paradigm. Prior work learns representations of text tokens and visual features with cross-attention mechanisms and captures the alignment solely based on indirect signals. In this work, we propose to enhance the alignment mechanism by incorporating image scene graph structures as the bridge between

Conversational AI

It is better to verify: Semi-supervised learning with a human in the loop for large-scale NLU models

Verena Weber, Enrico Piovano, Melanie Bradford

NAACL 2021 Workshop on Data Science with Human-in-the-loop

2021

When an NLU model is updated, new utterances must be annotated to be included for training. However, manual annotation is very costly. We evaluate a semi-supervised learning workflow with a human in the loop in a production environment. The previous NLU model predicts the annotation of the new utterances; a human then reviews the predicted annotation. Only when the NLU prediction is assessed as incorrect

Conversational AI

A novel framework for discovering cognitive models of learning

Jinjin Zhao, Candace Thille, Neelesh Gattani, Dawn Zimmaro

L@S 2021

2021

A cognitive model is a descriptive account or computational representation of human thinking about a given concept, skill, or domain. A cognitive model of learning includes both a way of organizing knowledge within a subject area and an account of how humans develop accurate and complete knowledge of that subject area. Learning designers engage in a variety of practices to unpack knowledge from subject

Conversational AI

LaTeX-Numeric: Language-agnostic text attribute eXtraction for e-commerce numeric attributes

Kartik Mehta, Ioana Oprea, Nikhil Rasiwasia

NAACL 2021

2021

In this paper, we present LaTeX-Numeric, a high-precision, fully automated, scalable framework for extracting e-commerce numeric attributes from product text such as product descriptions. Most past work on attribute extraction is not scalable because it relies on manually curated training data, either with or without the use of active learning. We rely on distant supervision for training data generation, removing

Conversational AI

Alexa Conversations: An extensible data-driven approach for building task-oriented dialogue systems

Anish Acharya, Suranjit Adhikari, Sanchit Agarwal, Vincent Auvray, Nehal Belgamwar, Arijit Biswas, Shubhra Chandra, Tagyoung Chung, Maryam Fazel-Zarandi, Raefer Gabriel, Shuyang Gao, Rahul Goel, Dilek Hakkani-Tür, Jan Jezabek, Abhay Jha, Jiun-Yu Kao, Prakash Krishnan, Peter Ku, Anuj Goyal, Chien-Wei Lin, Qing Liu, Arindam Mandal, Angeliki Metallinou, Vishal Naik, Yi Pan, Shachi Paul, Vittorio Perera, Abhishek Sethi, Minmin Shen, Nikko Ström, Eddie Wang

NAACL 2021

2021

Traditional goal-oriented dialogue systems rely on various components such as natural language understanding, dialogue state tracking, policy learning and response generation. Training each component requires annotations which are hard to obtain for every new domain, limiting scalability of such systems. Similarly, rule-based dialogue systems require extensive writing and maintenance of rules and do not

Conversational AI

Towards modeling the style of translators in neural machine translation

Yue Wang, Cuong Hoang, Marcello Federico

NAACL 2021

2021

One key ingredient of neural machine translation is the use of large datasets from different domains and resources (e.g. Europarl, TED talks). These datasets contain documents translated by professional translators using different but consistent translation styles. Despite that, the model is usually trained in a way that neither explicitly captures the variety of translation styles present in the data nor

Conversational AI

Knowledge-driven slot constraints for goal-oriented dialogue systems

Piyawat Lertvittayakumjorn, Daniele Bonadiman, Saab Mansour

NAACL 2021

2021

In goal-oriented dialogue systems, users provide information through slot values to achieve specific goals. Practically, some combinations of slot values can be invalid according to external knowledge. For example, a combination of “cheese pizza” (a menu item) and “oreo cookies” (a topping) from an input utterance “Can I order a cheese pizza with oreo cookies on top?” exemplifies such invalid combinations

Conversational AI

Entity resolution in open-domain conversations

Mingyue Shang, Tong Wang, Mihail Eric, Jiangning Chen, Jiyang Wang, Matthew Welch, Tiantong Deng, Akshay Grewal, Han Wang, Yue Liu, Imre Kiss, Yang Liu, Dilek Hakkani-Tür

NAACL 2021

2021

In recent years, incorporating external knowledge for response generation in open-domain conversation systems has attracted great interest. To improve the relevance of retrieved knowledge, we propose a neural entity linking (NEL) approach. Different from formal documents such as news, conversational utterances are informal and multi-turn, which makes it more challenging to disambiguate the entities. Therefore

Conversational AI

Optimizing NLU reranking using entity resolution signals in multi-domain dialog systems

Tong Wang, Jiangning Chen, Mohsen Malmir, Shuyan Dong, Xin He, Han Wang, Chengwei Su, Yue Liu, Yang Liu

NAACL 2021

2021

In dialog systems, the Natural Language Understanding (NLU) component typically makes the interpretation decision (including domain, intent and slots) for an utterance before the mentioned entities are resolved. This may result in intent classification and slot tagging errors. In this work, we propose to leverage Entity Resolution (ER) features in NLU reranking and introduce a novel loss term based on ER

Conversational AI

Structured prediction as translation between augmented natural languages

Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

ICLR 2021

2021

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discriminative

Conversational AI

Proteno: Text normalization with limited data for fast deployment in text-to-speech systems

Shubhi Tyagi, Antonio Bonafonte, Jaime Lorenzo Trueba, Javier Latorre

NAACL 2021

2021

Developing Text Normalization (TN) systems for Text-to-Speech (TTS) on new languages is hard. We propose a novel architecture to facilitate it for multiple languages while using less than 3% of the data used by the state-of-the-art results on English. We treat TN as a sequence classification problem and propose a granular tokenization mechanism that enables the system to learn majority


Conversational AI

OodGAN: Generative adversarial network for out-of-domain data generation

Petr Marek, Vishal Naik, Vincent Auvray, Anuj Goyal

NAACL 2021

2021

Detecting an Out-of-Domain (OOD) utterance is crucial for a robust dialog system. Most dialog systems are trained on a pool of annotated OOD data to achieve this goal. However, collecting annotated OOD data for a given domain is an expensive process. To mitigate this issue, previous works have proposed generative adversarial network (GAN)-based models to generate OOD data for a given domain automatically

Conversational AI

Zero-shot spoken language understanding for English-Hindi: An easy victory against word order divergence

Judith Gaspers, Quynh Ngoc Thi Do

ICLR 2021 Workshop on Practical ML for Developing Countries

2021

While the strong zero-shot performance of multilingual BERT has been shown to drop in cases of word order divergence between source and target language, the problem has rarely been studied to date. In this paper, we explore light-weight techniques to improve BERT-based zero-shot spoken language understanding for English-Hindi, which are languages with divergent word orders. We show that word order divergence

Conversational AI

VoiSeR: A new benchmark for voice-based search refinement

Simone Filice, Giuseppe Castellucci, Marcus Collins, Eugene Agichtein, Oleg Rokhlenko

EACL 2021

2021

Voice assistants, e.g., Alexa or Google Assistant, have dramatically improved in recent years. Supporting voice-based search, exploration, and refinement are fundamental tasks for voice assistants, and remain an open challenge. For example, when using voice to search an online shopping site, a user often needs to refine their search by some aspect or facet. This common user intent is usually available through

Conversational AI

Linking entities to unseen knowledge bases with arbitrary schemas

Yogarshi Vyas, Miguel Ballesteros

NAACL 2021

2021

In entity linking, mentions of named entities in raw text are disambiguated against a knowledge base (KB). This work focuses on linking to unseen KBs that do not have training data and whose schema is unknown during training. Our approach relies on methods to flexibly convert entities with several attribute-value pairs from arbitrary KBs into flat strings, which we use in conjunction with state-of-the-art

Conversational AI
