Amazon’s 23 papers at EMNLP 2021

Natural-language understanding and question answering are areas of focus, with additional topics ranging from self-learning to text summarization.

November 5, 2021

Of the 23 papers that Amazon researchers are presenting at next week's Conference on Empirical Methods in Natural Language Processing (EMNLP), the majority concentrate on two topics: natural-language understanding, or the semantic interpretation of text, and question answering, both of which are important across Amazon businesses, including Alexa, Amazon Web Services, and the Amazon Store.

The remaining 10 papers cover a range of topics, from self-learning and information retrieval to language modeling and machine translation.

The meta teacher-student network (MetaTS), a teacher-student framework that allows the teacher to dynamically adapt its pseudoannotation strategies based on the student’s feedback. Figure from "MetaTS: Meta teacher-student network for multilingual sequence labeling with minimal supervision".

Within the area of natural-language understanding, Amazon researchers apply a battery of techniques — such as semi-supervised learning, few-shot learning, and contrastive learning — to a variety of subproblems, such as visual referring-expression recognition, or identifying which object in an image a natural-language expression refers to; coreference resolution, or determining whether different terms refer to the same entity; and dealing with distribution shift, or a mismatch between the distribution of data at inference time and the distribution in the training set.

Feedback attribution for counterfactual bandit learning in multi-domain spoken language understanding
Tobias Falke, Patrick Lehnen
MetaTS: Meta teacher-student network for multilingual sequence labeling with minimal supervision
Zheng Li, Danqing Zhang, Tianyu Cao, Ying Wei, Yiwei Song, Bing Yin
Mind the context: The impact of contextualization in neural module networks for grounding visual referring expression
Arjun R. Akula, Spandana Gella, Keze Wang, Song-Chun Zhu, Siva Reddy
Nearest neighbor few-shot learning for cross-lingual classification
M. Saiful Bari, Batool Haider, Saab Mansour
ODIST: Open world classification via distributionally shifted instances
Lei Shu, Yassine Benajiba, Saab Mansour, Yi Zhang
Pairwise supervised contrastive learning of sentence representations
Dejiao Zhang, Shang-Wen Li, Wei Xiao, Henghui Zhu, Ramesh Nallapati, Andrew O. Arnold, Bing Xiang
Sequential cross-document coreference resolution
Emily Allaway, Shuai Wang, Miguel Ballesteros

Amazon researchers’ work on question answering includes helping conversational-AI agents suggest follow-up questions during interactions with customers; filtering out unanswerable questions to avoid wasting system resources; and few-shot learning.

A new approach to few-shot learning for question answering formulates the task as masked span filling during fine-tuning. This enables the use of the pretraining objective during fine-tuning, making the system extremely sample efficient. *Top:* Pretraining framework; *middle:* existing fine-tuning frameworks; *bottom:* proposed fine-tuning framework. Figure from "FewshotQA: A simple framework for few-shot learning of question answering tasks using pre-trained text-to-text models".
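The masked-span-filling idea described in the caption above can be illustrated with a plain string template. This is a minimal sketch rather than the paper's code: the template wording, the `<mask>` token, and the function name `build_example` are assumptions made for this example, and a real system would pass the resulting strings to a pretrained text-to-text model.

```python
# Assumed mask token; the actual token depends on the pretrained model used.
MASK = "<mask>"


def build_example(question, context, answer=None):
    """Format a QA pair as a masked-span-filling example.

    Returns (source, target). The source hides the answer behind a mask
    token; the target repeats the question with the answer filled in.
    At inference time no answer is known, so the target is None.
    """
    source = f"Question: {question} Answer: {MASK}. Context: {context}"
    if answer is None:
        return source, None
    target = f"Question: {question} Answer: {answer}."
    return source, target


src, tgt = build_example(
    question="Who wrote Hamlet?",
    context="Hamlet is a tragedy written by William Shakespeare.",
    answer="William Shakespeare",
)
print(src)
print(tgt)
```

Because the fine-tuning input looks like a corrupted sentence and the output looks like its reconstruction, the model is asked to do during fine-tuning roughly what it already did during pretraining, which is the property the caption credits for sample efficiency.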

Amazon Web Services researchers address questions of fairness in a paper on mitigating gender bias in machine translation models.

GFST: Gender-filtered self-training for more accurate gender in translation
Prafulla Kumar Choubey, Anna Currey, Prashant Mathur, Georgiana Dinu

In the area of information retrieval, Amazon papers investigate an integrated model for conversational search and the identification of counterfactual claims in product reviews that can create a misleading impression of the reviewer’s sentiment.

End-to-end conversational search for online shopping with utterance transfer
Liqiang Xiao, Jun Ma, Xin Luna Dong, Pascual Martinez-Gomez, Nasser Zalmout, Chenwei Zhang, Tong Zhao, Hao He, Yaohui Jin
I wish I would have loved this one, but I didn’t: A multilingual dataset for counterfactual detection in product reviews
James O'Neill, Polina Rozenshtein, Ryuichi Kiryo, Motoko Kubota, Danushka Bollegala

A pair of Amazon papers look at the type of language modeling that accounts for so much of the recent success of natural-language-processing models.

Alexa researchers combined data mixing and elastic weight consolidation to improve the adaptation of machine translation models to new tasks.
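For readers unfamiliar with elastic weight consolidation (EWC), the sketch below shows its standard penalty term: a quadratic cost for moving each parameter away from its value under the original model, weighted by an estimate of that parameter's importance (typically the diagonal of the Fisher information). It illustrates the general technique only, not the implementation in the paper; the function names and the toy numbers are assumptions, and data mixing is not shown.

```python
def ewc_penalty(params, anchor_params, fisher, strength):
    """Standard EWC penalty: (strength / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params        -- current parameter values during adaptation
    anchor_params -- parameter values of the original (pre-adaptation) model
    fisher        -- per-parameter importance estimates (diagonal Fisher)
    strength      -- scalar weight on the penalty (often written lambda)
    """
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


def total_loss(task_loss, params, anchor_params, fisher, strength):
    """New-task loss plus the EWC penalty that discourages forgetting."""
    return task_loss + ewc_penalty(params, anchor_params, fisher, strength)


# Toy example: only the first parameter has moved away from its anchor.
penalty = ewc_penalty(
    params=[1.0, 2.0],
    anchor_params=[0.0, 2.0],
    fisher=[2.0, 5.0],
    strength=1.0,
)
print(penalty)
```

Parameters with large importance estimates are held close to their original values, while unimportant ones remain free to adapt to the new domain.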

Improving the quality trade-off for neural machine translation multi-domain adaptation
Eva Hasler, Tobias Domhan, Jonay Trenous, Ke Tran, Bill Byrne, Felix Hieber

Paraphrase generation varies the surface form of sentences while preserving their semantic content, so it can help augment training data for other natural-language-processing tasks.

Learning to selectively learn for weakly-supervised paraphrase generation
Kaize Ding, Dingcheng Li, Alexander Hanbo Li, Xing Fan, Chenlei (Edward) Guo, Yang Liu, Huan Liu

Self-learning is the use of implicit feedback signals to automatically improve machine learning models, without the need for human intervention.

Interrupting a conversational-AI agent to rephrase a request provides an *implicit-feedback* signal that can be used to automatically label training data, which can help improve the underlying machine learning model. Figure from "A scalable framework for learning from implicit user feedback to improve natural language understanding in large-scale conversational AI systems".

A scalable framework for learning from implicit user feedback to improve natural language understanding in large-scale conversational AI systems
Sunghyun Park, Han Li, Ameen Patel, Sidharth Mudgal, Sungjin Lee, Young-Bum Kim, Spyros Matsoukas, Ruhi Sarikaya
Contextual rephrase detection for reducing friction in dialogue system
Zhuoyi Wang, Saurabh Gupta, Jie Hao, Xing Fan, Dingcheng Li, Alexander Hanbo Li, Chenlei (Edward) Guo

Text summarization is a widely studied problem in natural-language processing, and a new paper from Amazon Web Services considers the particular problems it presents in the context of dialogue.

A bag of tricks for dialogue summarization
Muhammad Khalifa, Miguel Ballesteros, Kathleen McKeown

For more on Amazon's involvement at EMNLP, see our interview with Georgiana Dinu, an applied scientist with Amazon Web Services and a conference area chair for machine learning for natural-language processing.

About the Author

Larry Hardesty

Larry Hardesty is the editor of the Amazon Science blog. Previously, he was a senior editor at MIT Technology Review and the computer science writer at the MIT News Office.
