Search - Amazon Science

Progressive coordinate transforms for monocular 3D object detection

Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, Xiangyang Xue

2021

Recognizing and localizing objects in 3D space is a crucial ability for an AI agent to perceive its surrounding environment. While significant progress has been achieved with expensive LiDAR point clouds, 3D object detection given only a monocular image remains a great challenge. While there exist different alternatives for tackling this problem, it is found that they are either equipped with heavy

Computer vision

Amazon multilingual counterfactual dataset (AMCD)

James O'Neill, Polina Rozenshtein, Ryuichi Kiryo, Motoko Kubota, Danushka Bollegala

2021

This repository contains the dataset described in the paper "I Wish I Would Have Loved This One, But I Didn’t – A Multilingual Dataset for Counterfactual Detection in Product Reviews" by James O'Neill, Polina Rozenshtein, Ryuichi Kiryo, Motoko Kubota, and Danushka Bollegala (EMNLP'21). The dataset contains sentences from Amazon customer reviews (sampled from the Amazon product review dataset) annotated

Information and knowledge management

Bias in open-ended language generation dataset (BOLD)

Jwala Dhamala, Tony Sun, Varun Kumar, Satyapriya Krishna, Yada Pruksachatkun, Kai-Wei Chang, Rahul Gupta

2021

Bias in Open-ended Language Generation Dataset (BOLD) is a dataset for evaluating fairness in open-ended language generation in English. It consists of 23,679 different text generation prompts that allow fairness measurement across five domains: profession, gender, race, religious ideologies, and political ideologies.

Conversational AI

DataTuner

Hamza Harkous, Isabel Groves, Amir Saffari

2021

In this work, we present DATATUNER, a neural, end-to-end data-to-text generation system that makes minimal assumptions about the data representation and target domain. We take a two-stage generation-reranking approach, combining a fine-tuned language model with a semantic fidelity classifier. Each component is learnt end-to-end without needing dataset-specific heuristics, entity delexicalization, or post-processing

Conversational AI

Generating self-contained and summary-centric question answer pairs via differentiable reward imitation learning

Li Zhou, Kevin Small, Yong Zhang, Sandeep Atluri

2021

Motivated by suggested question generation in conversational news recommendation systems, we propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers. We begin by collecting a new dataset of news articles with questions as titles and pairing them with summaries of varying length. This dataset is used

Conversational AI

EmBERT: A transformer model for embodied, language-guided visual task completion

Alessandro Suglia, Qiaozi (QZ) Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme

2021

We present Embodied BERT (EmBERT), a transformer-based model which can attend to high-dimensional, multi-modal inputs across long temporal horizons for language-conditioned task completion. Additionally, we bridge the gap between successful object-centric navigation models used for non-interactive agents and the language-guided visual task completion benchmark, ALFRED, by introducing object navigation targets

Conversational AI

A statistical extension of byte-pair encoding

David Vilar, Marcello Federico

2021

Sub-word segmentation is currently a standard tool for training neural machine translation (MT) systems and other NLP tasks. The goal is to split words (both in the source and target languages) into smaller units which then constitute the input and output vocabularies of the MT system. The aim of reducing the size of the input and output vocabularies is to increase the generalization capabilities of the

Conversational AI
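
As a rough illustration of the sub-word segmentation idea summarized in the entry above, the following is a minimal sketch of plain byte-pair encoding in Python. It is not the statistical extension the paper proposes; the function names (`learn_bpe`, `merge_pair`, `segment`) and the tie-breaking rule are assumptions made for this example only.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (plain BPE)."""
    # Each word starts as a tuple of characters, weighted by its frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole vocabulary.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties go to the lexicographically
        # largest pair (an arbitrary choice made here for determinism).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the new merge to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split `word` into sub-word units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    rules = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
    print(rules)
    print(segment("lowly", rules))
```

The number of merges controls the vocabulary size: fewer merges leave words split into smaller, more reusable units, while more merges approach whole-word tokens.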

Adversarial robustness with non-uniform perturbations

Ecenaz Erdemir, Jeffrey Bickford, Luca Melis, Sergul Aydore

2021

The key idea of our proposed approach is to enable non-uniform perturbations that can adequately represent these feature dependencies during adversarial training. We propose using characteristics of the empirical data distribution, both on correlations between the features and the importance of the features themselves. Using experimental datasets for malware classification, credit risk prediction, and spam

Machine learning

Attention-based contextual language modeling adaptation

Richard Diehl Martinez, Scott Novotney, Ivan Bulyko, Ariya Rastrow, Andreas Stolcke, Ankur Gandhe

2021

This project provides the source to reproduce the main methods of the paper "Attention-based contextual language model adaptation for speech recognition", submitted to ACL 2021. This codebase also implements additional functionality that was not explicitly described in the paper, such as experimental methods for combining multiple types of non-linguistic context together (e.g. geo-location, and datetime

Conversational AI

Gender-filtered self-training (GFST) for NMT

Prafulla Kumar Choubey, Anna Currey, Prashant Mathur, Georgiana Dinu

2021

We propose gender-filtered self-training (GFST) to improve gender translation accuracy on unambiguously gendered inputs. Our GFST approach uses a source monolingual corpus and an initial model to generate gender-specific pseudo-parallel corpora which are then filtered and added to the training data. We evaluate GFST on translation from English into five languages, finding that it improves gender accuracy

Conversational AI

FeatGraph: Sparse kernels for GNNs based on TVM

Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang

2021

This paper proposes FeatGraph to accelerate GNN workloads by co-optimizing graph traversal and feature dimension computation. FeatGraph provides a flexible programming interface to express diverse GNN models by composing coarse-grained sparse templates with fine-grained user-defined functions (UDFs) on each vertex/edge. FeatGraph incorporates optimizations for graph traversal into the sparse templates and

Cloud and systems

Proteno

Shubhi Tyagi, Antonio Bonafonte, Jaime Lorenzo Trueba, Javier Latorre

2021

Developing Text Normalization (TN) systems for Text-to-Speech (TTS) on new languages is hard. We propose a novel architecture to facilitate it for multiple languages while using less than 3% of the data used by state-of-the-art systems for English. We treat TN as a sequence classification problem and propose a granular tokenization mechanism that enables the system to learn majority

Conversational AI

LUMINOUS: Indoor scene generation for embodied AI challenges

Yizhou Zhao, Kaixiang Lin, Zhiwei Jia, Qiaozi (QZ) Gao, Govind Thattai, Jesse Thomason, Gaurav Sukhatme

2021

Luminous is a framework for testing the performance of embodied AI (EAI) models on indoor tasks. The repository integrates several kinds of functionality for evaluating EAI performance on indoor tasks. The Indoor Scene Synthesis module provides different methods for synthesizing randomized indoor scenes that can be visualized in the Unity engine. The Luminous for Alfred offers

Computer vision

TANL: Structured prediction as translation between augmented natural languages

Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

2021

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discriminative

Conversational AI

CrossCLR: Cross-modal contrastive learning for multi-modal video representations

Mohammadreza Zolfaghari, Yi Zhu, Peter Gehler, Thomas Brox

2021

Contrastive learning allows us to flexibly define powerful losses by contrasting positive pairs from sets of negative samples. Recently, the principle has also been used to learn cross-modal embeddings for video and text, yet without exploiting its full potential. In particular, previous losses do not take the intra-modality similarities into account, which leads to inefficient embeddings, as the same content

Computer vision

Uniform sampling over episode difficulty

Sébastien M. R. Arnold, Guneet Singh Dhillon, Avinash Ravichandran, Stefano Soatto

2021

Episodic training is a core ingredient of few-shot learning to train models on tasks with limited labelled data. Despite its success, episodic training remains largely understudied, prompting us to ask the question: what is the best way to sample episodes? In this paper, we first propose a method to approximate episode sampling distributions based on their difficulty. Building on this method, we perform

Computer vision

Amazon DenseClus

Charles Frenzel, Baichuan Sun, Eden Duthie, Yin Song

2021

DenseClus is a Python module for clustering mixed-type data using UMAP and HDBSCAN. Because it accepts both categorical and numerical data, DenseClus makes it possible to incorporate all features in clustering.

Machine learning

DSTC10 Track 2 - Knowledge-grounded task-oriented dialogue modeling on spoken conversations

Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Behnam Hedayatnia, Karthik Gopalakrishnan, Dilek Hakkani-Tür

2021

A lot of recent work in dialogue modeling has been on written conversations, partly because of available data sets. However, written dialogues are not sufficient to fully capture the nature of spoken conversations as well as the potential effect of speech recognition errors on practical spoken dialogue systems. This challenge track aims to provide a new benchmark on spoken task-oriented conversations. We

Conversational AI

Learning better visual dialog agents with pretrained visual-linguistic representation

Tao Tu, Qing Ping, Govind Thattai, Gokhan Tur, Prem Natarajan

2021

GuessWhat?! is a visual dialog guessing game which incorporates a Questioner agent that generates a sequence of questions, while an Oracle agent answers the respective questions about a target object in an image. Based on this dialog history between the Questioner and the Oracle, a Guesser agent makes a final guess of the target object. While previous work has focused on dialogue policy optimization and

Computer vision

Question answering using web lists

Anoop Katti, Kai Hui, Adrià de Gispert, Hagen Fuerstenau

2021

This repository contains the ListQA datasets described in the paper "Question Answering using Web Lists". The two datasets, NQWebList and GQWebList, use subsets of questions from Natural Questions and GooAQ, respectively. To build these datasets, each annotator was shown a question and a relevant URL from the web and was asked to annotate the list answer on the URL, if it exists. For annotating a list, the annotators

Conversational AI
