Search - Amazon Science

Video contrastive learning with global context (VCLR)

Haofei Kuang, Yi Zhu, Zhi Zhang, Xinyu Li, Joe Tighe, Sören Schwertfeger, Cyrill Stachniss, Mu Li

2021

Contrastive learning has revolutionized the self-supervised image representation learning field and has recently been adapted to the video domain. One of the greatest advantages of contrastive learning is that it allows us to flexibly define powerful loss objectives as long as we can find a reasonable way to formulate positive and negative samples to contrast. However, existing approaches rely heavily on the

Computer vision

Improving factual consistency of abstractive text summarization

Feng Nan, Ramesh Nallapati, Zhiguo Wang, Cicero Nogueira dos Santos, Henghui Zhu, Dejiao Zhang, Kathleen McKeown, Bing Xiang

2021

We provide the code for the papers: "Entity-level factual consistency of abstractive text summarization", EACL 2021. We provide a set of new metrics to quantify the entity-level factual consistency of generated summaries. We also provide code for the two methods in our paper: JAENS: joint entity and summary generation, and Summary-worthy entity classification with summarization (multi-task learning) "Improving

Conversational AI

Generalized fairness metrics

Paula Czarnowska, Yogarshi Vyas, Kashif Shah

2021

Measuring bias is key for better understanding and addressing unfairness in NLP/ML models. This is often done via fairness metrics which quantify the differences in a model's behaviour across a range of demographic groups. In this work, we shed more light on the differences and similarities between the fairness metrics used in NLP. First, we unify a broad range of existing metrics under three generalized

Unified-EPT

Fangrui Zhu, Yi Zhu, Li Zhang, Chongruo Wu, Yanwei Fu, Mu Li

2021

Semantic segmentation is a challenging problem due to difficulties in modeling context in complex scenes and class confusions along boundaries. Most literature either focuses on context modeling or boundary refinement, which is less generalizable in open-world scenarios. In this work, we advocate a unified framework (UN-EPT) to segment objects by considering both context information and boundary artifacts

Computer vision

Long short-term transformer for online action detection

Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Xia, Zhuowen Tu, Stefano Soatto

2021

We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data. It consists of an LSTR encoder that dynamically leverages coarse-scale historical information from an extended temporal window (e.g., 2048 frames spanning up to 8 minutes), together with an LSTR decoder that focuses

Computer vision

SCCL: Supporting clustering with contrastive learning

Dejiao Zhang, Feng Nan, Xiaokai Wei, Daniel Li, Henghui Zhu, Kathleen McKeown, Ramesh Nallapati, Andrew O. Arnold, Bing Xiang

2021

Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space. However, different categories often overlap with each other in the representation space at the beginning of the learning process, which poses a significant challenge for distance-based clustering in achieving good separation between different categories. To this end

Conversational AI

GAP-text2SQL: Learning contextual representations for semantic parsing with generation-augmented pre-training

Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira dos Santos, Bing Xiang

2021

Most recently, there has been significant interest in learning contextual representations for various NLP tasks, by leveraging large scale text corpora to train large neural language models with self-supervised learning objectives, such as Masked Language Model (MLM). However, based on a pilot study, we observe three issues of existing general-purpose language models when they are applied to text-to-SQL

Conversational AI

Nlu-slot-constraints

Piyawat Lertvittayakumjorn, Daniele Bonadiman, Saab Mansour

2021

In goal-oriented dialogue systems, users provide information through slot values to achieve specific goals. Practically, some combinations of slot values can be invalid according to external knowledge. For example, a combination of “cheese pizza” (a menu item) and “oreo cookies” (a topping) from an input utterance “Can I order a cheese pizza with oreo cookies on top?” exemplifies such invalid combinations

Conversational AI

Real world noise benchmarks for natural language understanding

Sailik Sengupta, Jason Krone, Saab Mansour

2021

Intent Classification (IC) and Slot Labeling (SL) models, which form the basis of dialogue systems, often encounter noisy data in real-world environments. In this work, we investigate how robust IC/SL models are to noisy data. We collect and publicly release a test-suite for seven common noise types found in production human-to-bot conversations (abbreviations, casing, misspellings, morphological variants

Conversational AI

Information content of samples

Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

2021

We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights. Though related, we show that these quantities have a qualitatively different behavior. We give efficient approximations of these quantities using a linearized

Machine learning

Efficiently summarizing text and graph encodings of multi-document clusters

Ramakanth Pasunuru, Mengwen Liu, Mohit Bansal, Sujith Ravi, Markus Dreyer

2021

This is the implementation of the paper Efficiently summarizing text and graph encodings of multi-document clusters.

Conversational AI

MinimaxFair - Convergent algorithms for (relaxed) minimax group fairness

Emily Diana, Wesley Gill, Michael Kearns, Krishnaram Kenthapadi, Aaron Roth

2021

MinimaxFair is a Python package for training ML models for (relaxed) minimax group fairness, as discussed in "Minimax group fairness: Algorithms and experiments". This repository contains Python code for: learning models that achieve minimax group fairness for both regression and classification tasks; learning models that minimize error subject to relaxed group fairness constraints; visualizing tradeoffs between

Machine learning

Symbolic music generation with transformer-GANs

Aashiq Muhamed, Liang Li, Xingjian Shi, Suri Yaddanapudi, Wayne Chi, Dylan Jackson, Rahul Suresh, Zachary Lipton, Alex Smola

2021

Transformers have emerged as the dominant approach in music literature for generating minute-long compositions with compelling musical structure. These models are trained by minimizing the negative log-likelihood (NLL) of the observed sequence autoregressively. Unfortunately, the quality of samples from these models tends to degrade significantly for long sequences, a phenomenon attributed to exposure bias

Conversational AI

PIZZA - a task-oriented semantic parsing dataset

Konstantine Arkoudas, Nicolas Guenon Des Mesnards, Melanie Rubino, Sandesh Swamy, Saarthak Khanna, Weiqi Sun

2021

Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate. This paper continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot

Conversational AI

RAF: RAF accelerates deep learning frameworks

Cody Yu, Yizhi Liu, Haichen Shen, Zhen Jia, Jie Wang, Yuan Zhou, Animesh Jain, Yida Wang

2021

RAF is a deep learning compiler (DLC) for training. Unlike existing DLCs, RAF accepts a forward model and generates a training graph in-house. Accordingly, RAF is able to systematically consolidate graph optimizations for performance, memory and distributed training. In addition, to catch up to the state-of-the-art performance with hand-crafted kernel libraries as well as tensor compilers, RAF proposes an operator

Machine learning

Progressive coordinate transforms for monocular 3D object detection

Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, Xiangyang Xue

2021

Recognizing and localizing objects in the 3D space is a crucial ability for an AI agent to perceive its surrounding environment. While significant progress has been achieved with expensive LiDAR point clouds, it poses a great challenge for 3D object detection given only a monocular image. While there exist different alternatives for tackling this problem, it is found that they are either equipped with heavy

Computer vision

Amazon multilingual counterfactual dataset (AMCD)

James O'Neill, Polina Rozenshtein, Ryuichi Kiryo, Motoko Kubota, Danushka Bollegala

2021

This repository contains a dataset described in the paper "I Wish I Would Have Loved This One, But I Didn’t – A Multilingual Dataset for Counterfactual Detection in Product Reviews", James O’Neill, Polina Rozenshtein, Ryuichi Kiryo, Motoko Kubota, Danushka Bollegala, EMNLP'21. The dataset contains sentences from Amazon customer reviews (sampled from the Amazon product review dataset) annotated

Information and knowledge management

Bias in open-ended language generation dataset (BOLD)

Jwala Dhamala, Tony Sun, Varun Kumar, Satyapriya Krishna, Yada Pruksachatkun, Kai-Wei Chang, Rahul Gupta

2021

Bias in Open-ended Language Generation Dataset (BOLD) is a dataset for evaluating fairness in open-ended language generation in the English language. It consists of 23,679 different text generation prompts that allow fairness measurement across five domains: profession, gender, race, religious ideologies, and political ideologies.

Conversational AI

DataTuner

Hamza Harkous, Isabel Groves, Amir Saffari

2021

In this work, we present DATATUNER, a neural, end-to-end data-to-text generation system that makes minimal assumptions about the data representation and target domain. We take a two-stage generation-reranking approach, combining a fine-tuned language model with a semantic fidelity classifier. Each component is learnt end-to-end without needing dataset-specific heuristics, entity delexicalization, or post-processing

Conversational AI

Generating self-contained and summary-centric question answer pairs via differentiable reward imitation learning

Li Zhou, Kevin Small, Yong Zhang, Sandeep Atluri

2021

Motivated by suggested question generation in conversational news recommendation systems, we propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers. We begin by collecting a new dataset of news articles with questions as titles and pairing them with summaries of varying length. This dataset is used

Conversational AI
