Search - Amazon Science

ReviseSum

Griffin Adams, Han-Chin Shing, Qing Sun, Christopher Winestock, Kathleen McKeown, Noémie Elhadad

2022

In real-world scenarios with naturally occurring datasets, reference summaries are noisy and may contain information that cannot be inferred from the source text. On large news corpora, removing low-quality samples has been shown to reduce model hallucinations. Yet, for smaller and/or noisier corpora, filtering is detrimental to performance. To improve reference quality while retaining all data, we propose

Conversational AI

Pairwise fairness for ordinal regression

Matthaus Kleindessner, Samira Samadi, Bilal Zafar, Krishnaram Kenthapadi, Chris Russell

2022

We initiate the study of fairness for ordinal regression. We adapt two fairness notions previously considered in fair ranking and propose a strategy for training a predictor that is approximately fair according to either notion. Our predictor has the form of a threshold model, composed of a scoring function and a set of thresholds, and our strategy is based on a reduction to fair binary classification for

Machine learning
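
The threshold-model form mentioned in the abstract above (a scoring function plus a set of thresholds) can be illustrated with a minimal Python sketch. This is not the authors' code and does not implement their fairness strategy; the names make_threshold_predictor and linear_score, the weights, the threshold values, and the tie-breaking convention are all assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def make_threshold_predictor(
    score: Callable[[Sequence[float]], float],
    thresholds: Sequence[float],
) -> Callable[[Sequence[float]], int]:
    """Build a predictor that maps an input to an ordinal label.

    Labels range over 0..len(thresholds). Hypothetical helper, not from the paper.
    """
    cuts = sorted(thresholds)

    def predict(x: Sequence[float]) -> int:
        # The label is the number of thresholds that are <= the score.
        # Assumed convention: a score equal to a threshold gets the higher label.
        return bisect_right(cuts, score(x))

    return predict


def linear_score(x: Sequence[float]) -> float:
    # Illustrative scoring function with made-up weights.
    weights = (0.5, 1.0)
    return sum(w * xi for w, xi in zip(weights, x))


predict = make_threshold_predictor(linear_score, thresholds=[1.0, 2.0, 3.0])
labels = [predict(x) for x in [(1.0, 0.0), (0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]]
```

With three thresholds the predictor outputs one of four ordered labels; moving a threshold changes where one label ends and the next begins without changing the scoring function.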

TGL: A general framework for temporal graph training on billion-scale graphs

Hongkuan Zhou, Da Zheng, Israt Nisa, Vassilis N. Ioannidis, Xiang Song, George Karypis

2022

Many real-world graphs contain time-domain information. Temporal Graph Neural Networks capture temporal information as well as structural and contextual information in the generated dynamic node embeddings. Researchers have shown that these embeddings achieve state-of-the-art performance in many different tasks. In this work, we propose TGL, a unified framework for large-scale offline Temporal Graph Neural

FewshotQA: A simple framework for few-shot learning of question answering tasks using pre-trained text-to-text models

Rakesh Chada, Pradeep Natarajan

2022

The task of learning from only a few examples (called a few-shot setting) is of key importance and relevance to a real-world setting. For question answering (QA), the current state-of-the-art pre-trained models typically need fine-tuning on tens of thousands of examples to obtain good results. Their performance degrades significantly in a few-shot setting (< 100 examples). To address this, we propose a

Conversational AI

FDB: Fraud Dataset Benchmark

Prince Grover, Zheng Li, Julia Xu, Justin Tittelfitz, Anqi Cheng, Jakub Zablocki, Jianbo Liu, Hao Zhou

2022

The Fraud Dataset Benchmark (FDB) is a compilation of publicly available datasets relevant to fraud detection (arXiv link). The FDB aims to cover a wide variety of fraud detection tasks, ranging from card-not-present transaction fraud, bot attacks, malicious traffic, and loan risk to content moderation. The Python-based data loaders from FDB provide dataset loading, standardized train-test splits and performance

Security, privacy, and abuse prevention

Crowd coachable recommendations (CCRec)

Yifei Ma

2022

Code for zero-shot recommendations and subsequent online learning and exploration with crowd-sourced preference labels.

Machine learning

ConTurE (Conversational turns evaluation)

Sarik Ghazarian, Behnam Hedayatnia, Alexandros Papangelis, Yang Liu, Dilek Hakkani-Tür, Chulaka Gunasekara, Seokhwan Kim, Luis Fernando D'Haro, Abhinav Rastogi, Yun-Nung Chen, Mihail Eric, Karthik Gopalakrishnan, Chao-Wei Huang, Jinchao Li, Qi Zhu, Lingxiao Luo, Lars Liden, Kaili Huang, Shahin Shayandeh, Runze Liang, Baolin Peng, Zheng Zhang, Swadheen Shukla, Minlie Huang, Jianfeng Gao, Shikib Mehri, Yulan Feng, Carla Gordon, Seyed Hossein Alavi, David Traum, Maxine Eskenazi, Ahmad Beirami, Eunjoon Cho, Paul A. Crook, Ankita De, Alborz Geramifard, Satwik Kottur, Seungwhan Moon, Shivani Poddar, Rajen Subba

2022

This README describes ConTurE, a turn-annotated version of the publicly released dataset from the Interactive Evaluation of Dialog track of DSTC9 (Gunasekara et al., 2020; Mehri et al., 2020). We sampled 119 dialogs from the original dataset for turn-level annotation, where each chatbot turn within the dialog was annotated for response quality. We hope releasing this dataset will benefit the community for

Conversational AI

Multimodal semi-supervised learning for text recognition

Aviad Aberdam, Roy Ganz, Shai Mazor, Ron Litman

2022

Until recently, the number of public real-world text images was insufficient for training scene text recognizers. Therefore, most modern training methods rely on synthetic data and operate in a fully supervised manner. Nevertheless, the amount of public real-world text images has increased significantly lately, including a great deal of unlabeled data. Leveraging these resources requires semi-supervised

Computer vision

Semantic parsing as abstractive question answering

Wenting Zhao, Konstantine Arkoudas, Weiqi Sun, Claire Cardie

2022

This repository contains two abstractive question answering datasets that are reduced from task-oriented parsing (TOP) datasets. They are used in the semantic parsing as abstractive question answering work. We also provide the related data processing scripts.

Conversational AI

Document level MT metrics

Giorgos Vernikos, Brian Thompson, Prashant Mathur, Marcello Federico

2022

We present a very simple method for extending pretrained machine translation metrics to incorporate document-level context. We apply our method to four popular metrics: BERTScore, Prism, COMET, and the reference-free metric COMET-QE. We evaluate our document-level metrics on the MQM annotations from the WMT 2021 metrics shared task and find that the document-level metrics outperform their sentence-level

Conversational AI

Overfitting in Bayesian optimization: An empirical study and early-stopping solution

Anastasia Makarova, Huibin Shen, Valerio Perrone, Aaron Klein, Jean Baptiste Faddoul, Andreas Krause, Matthias Seeger, Cédric Archambeau

2022

In this work, we propose an effective and intuitive termination criterion for BO that automatically stops the procedure if it is sufficiently close to the global optimum. Our key insight is that the discrepancy between the true objective (predictive performance on test data) and the computable target (validation performance) suggests stopping once the suboptimality in optimizing the target is dominated

Machine learning

Resource constrained naturalized semantic parsing

Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza

2022

Semantic parsing is an important NLP problem, particularly for voice assistants such as Alexa and Google Assistant. State-of-the-art (SOTA) semantic parsers are seq2seq architectures based on large language models that have been pretrained on vast amounts of text. To better leverage that pretraining, recent work has explored a reformulation of semantic parsing whereby the output sequences are themselves

Conversational AI

Alexa Voice Service (AVS)

Ravi Chemudugunta, Raj Palkar, James Powell

2022

The Alexa Voice Service (AVS) enables developers to integrate Alexa directly into their products, bringing the convenience of voice control to any connected device. AVS provides developers with access to a suite of resources to build Alexa-enabled products, including APIs, hardware development kits, software development kits, and documentation.

Conversational AI

TextAdaIN: Paying attention to shortcut learning in text recognizers

Oren Nuriel, Sharon Fogel, Ron Litman

2022

Leveraging the characteristics of convolutional layers, neural networks are extremely effective for pattern recognition tasks. However, in some cases, their decisions are based on unintended information, leading to high performance on standard benchmarks but also to a lack of generalization to challenging testing conditions and unintuitive failures. Recent work has termed this “shortcut learning” and addressed

Computer vision

Injecting domain knowledge in language models for task-oriented dialogue systems

Denis Emelin, Daniele Bonadiman, Sawsan Alqahtani, Yi Zhang, Saab Mansour

2022

Pre-trained language models (PLM) have advanced the state-of-the-art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data. Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks. However, knowledge bases (KBs) utilized in these studies are usually large-scale and static, in contrast to small, domain-specific, and

Conversational AI

The FoodOrdering dataset

Melanie Rubino, Nicolas Guenon Des Mesnards, Uday Shah, Nanjiang Jiang, Weiqi Sun, Konstantine Arkoudas

2022

The FoodOrdering dataset is a task-oriented parsing dataset in the food-ordering domain with utterances and annotations derived from the menus of five venues characteristic of that business vertical: burgers, burritos, coffees, pizzas, and subs.

Conversational AI

Alexa Teacher Model (AlexaTM 20B)

Saleh Soltan, Shankar Ananthakrishnan, Jack G. M. FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

2022

A 20 billion parameter multilingual seq2seq model called Alexa Teacher Model (AlexaTM 20B), which achieves state-of-the-art (SOTA) performance on 1-shot summarization tasks, outperforming a much larger 540B PaLM decoder model. AlexaTM 20B also achieves SOTA in 1-shot machine translation, especially for low-resource languages, across almost all language pairs supported by the model (Arabic, English, French

Conversational AI

Listen know spell dataset

Nilaksh Das, Monica Sunkara, Dhanush Bekal, Duen Horng Chau, Sravan Bodapati, Katrin Kirchhoff

2022

Automatic speech recognition (ASR) is increasingly being used in specialized domains such as medical ASR and news transcription. Owing to the lack of high-quality annotated speech data in such domains, off-the-shelf models are commonly employed by fine-tuning on domain-specific data. This poses a significant challenge in transcribing long-tail expressions and out-of-vocabulary (OOV) named entities. On the

Conversational AI

Answer consolidation

Wenxuan Zhou, Qiang Ning, Heba Elfardy, Kevin Small, Muhao Chen

2022

Current question answering (QA) systems primarily consider the single-answer scenario, where each question is assumed to be paired with one correct answer. However, in many real-world QA applications, multiple answer scenarios arise where consolidating answers into a comprehensive and non-redundant set of answers is a more efficient user interface. In this paper, we formulate the problem of answer consolidation

Conversational AI

Bias bounties

Ira Globus-Harris, Michael Kearns, Aaron Roth

2022

Project Description: This is a test framework for the bias bounties project. Getting Started as a Bounty Hunter: If you are interacting with this codebase as a "bounty hunter", you'll need to have a way to run Jupyter notebooks. The easiest way to do this is to download Anaconda, which will also manage all of your Python packages for you. See here for installation instructions: https://docs.anaconda.com/anaconda

Machine learning
