Search - Amazon Science

On symmetries in variational Bayesian neural nets

Richard Kurle, Tim Januschowski, Jan Gasthaus, Yuyang (Bernie) Wang

NeurIPS 2021 Workshop on Bayesian Deep Learning

2021

Probabilistic inference of Neural Network parameters is challenging due to the highly multi-modal likelihood functions. Most importantly, the permutation invariance of the neurons of the hidden layers renders the likelihood function unidentifiable with a factorial number of equivalent (symmetric) modes, independent of the data. We show that variational Bayesian methods that approximate the (multi-modal)

Machine learning
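
The permutation invariance mentioned in the abstract above can be illustrated with a small sketch. It is not taken from the paper: the one-hidden-layer network, the `mlp` helper, and all weight values are hypothetical, chosen only to show that reordering hidden units (together with their incoming and outgoing weights) leaves the network output unchanged, so a layer with n hidden units has n! equivalent parameter settings.

```python
import math
from itertools import permutations

def mlp(x, W1, b1, w2, b2):
    """One-hidden-layer network: tanh hidden units, scalar linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2

# Hypothetical weights: 2 inputs, 3 hidden units (illustrative values only).
W1 = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8]]
b1 = [0.1, -0.2, 0.0]
w2 = [1.0, -2.0, 0.5]
b2 = 0.3
x = [0.4, -1.2]

reference = mlp(x, W1, b1, w2, b2)

# Apply every permutation of the hidden units to the input weights,
# the hidden biases, and the output weights at the same time.
outputs = []
for perm in permutations(range(len(b1))):
    W1_p = [W1[i] for i in perm]
    b1_p = [b1[i] for i in perm]
    w2_p = [w2[i] for i in perm]
    outputs.append(mlp(x, W1_p, b1_p, w2_p, b2))

# Count permutations that reproduce the reference output (up to rounding).
num_equivalent = sum(math.isclose(o, reference, rel_tol=1e-9) for o in outputs)
print(num_equivalent, "of", len(outputs), "permutations give the same output")
```

Because the output is identical for every permutation, the likelihood takes the same value at all of these parameter settings regardless of the data.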

Forecasting with trees

Tim Januschowski, Yuyang (Bernie) Wang, Kari Torkkola, Timo Erkkilä, Hilaf Hasson, Jan Gasthaus

International Journal of Forecasting

2021

The prevalence of approaches based on gradient boosted trees among the top contestants in the M5 competition is potentially the most eye-catching result. Tree-based methods out-shone other solutions, in particular deep learning-based solutions. The winners in both tracks of the M5 competition heavily relied on them. This prevalence is even more remarkable given the dominance of other methods in the literature

Machine learning

Efficient query processing techniques for next-page retrieval

Joel Mackenzie, Matthias Petri, Alistair Moffat

Information Retrieval Journal

2021

In top-k ranked retrieval the goal is to efficiently compute an ordered list of the highest scoring k documents according to some stipulated similarity function such as the well-known BM25 approach. In most implementation techniques a min-heap of size k is used to track the top scoring candidates. In this work we consider the question of how best to retrieve the second page of search results, given that

Search and information retrieval
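
The abstract above notes that a min-heap of size k is commonly used to track the top-scoring candidates. A minimal sketch of that baseline follows, using Python's standard `heapq` module. The `top_k` function name, the document identifiers, and the scores are hypothetical, and this shows only the standard first-page top-k step, not the paper's next-page technique.

```python
import heapq

def top_k(scored_docs, k):
    """Return the k highest-scoring (doc_id, score) pairs, best first."""
    heap = []  # min-heap of (score, doc_id); heap[0] is the weakest kept candidate
    for doc_id, score in scored_docs:
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # New document beats the current k-th best: evict the weakest.
            heapq.heapreplace(heap, (score, doc_id))
    return [(doc_id, score) for score, doc_id in sorted(heap, reverse=True)]

# Hypothetical similarity scores (for example, produced by a BM25 scorer).
scores = [("d1", 2.1), ("d2", 7.4), ("d3", 0.5), ("d4", 5.0), ("d5", 6.2)]
first_page = top_k(scores, 3)
print(first_page)
```

The heap never holds more than k entries, so each document costs at most O(log k) work, and the heap root acts as the score threshold a new candidate must exceed.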

Towards safer continuous infrastructure-as-code deployments

Henrique Lima

arXiv

2021

As cloud computing resources become more widely adopted, the infrastructures in which they are used naturally grow in the amount of resources and overall complexity, becoming harder to manage. Infrastructure-as-Code (IaC) is presented as a solution to this problem, allowing developers to manage and provision these cloud resources programmatically. The infrastructure is then maintained through a code base, allowing

Cloud and systems

Kronecker factorization for preventing catastrophic forgetting in large-scale medical entity linking

Denis Jered McInerney, Chris (Luyang) Kong, Kristjan Arumae, Byron Wallace, Parminder Bhatia

NeurIPS 2021 Workshop on Machine Learning in Public Health

2021

Multi-task learning is useful in NLP because it is often practically desirable to have a single model that works across a range of tasks. In the medical domain, sequential training on tasks may sometimes be the only way to train models, either because access to the original (potentially sensitive) data is no longer available, or simply owing to the computational costs inherent to joint retraining. A major

Conversational AI

Hear from AWS Machine Learning Summit speakers

Staff writer

June 1, 2021

The event is over, but Amazon Science interviewed each of the six speakers within the Science of Machine Learning track. See what they had to say.

Machine learning

Probabilistic hierarchical forecasting with deep Poisson mixtures

Kin G. Olivares, O. Nganba Meetei, Ruijun Ma, Rohan Reddy, Mengfei Cao, Lee Dicker

NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications

2021

Hierarchical forecasting problems arise when time series have a natural group structure, and predictions at multiple levels of aggregation and disaggregation across the groups are needed. In such problems, it is often desired to satisfy the aggregation constraints in a given hierarchy, referred to as hierarchical coherence in the literature. Maintaining hierarchical coherence while producing accurate forecasts

Machine learning
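
Hierarchical coherence, as described in the abstract above, means that forecasts at aggregated levels equal the sum of the forecasts beneath them. The sketch below uses simple bottom-up aggregation, a standard baseline that is coherent by construction. It is not the paper's method, and the hierarchy, node names, and forecast values are hypothetical.

```python
# Hypothetical hierarchy: each parent maps to its list of children.
hierarchy = {
    "total": ["north", "south"],
    "north": ["store_a", "store_b"],
    "south": ["store_c"],
}
# Hypothetical point forecasts for the bottom-level series.
bottom_forecasts = {"store_a": 12.0, "store_b": 8.0, "store_c": 5.0}

def bottom_up(node, hierarchy, bottom):
    """Forecast every node under `node` by summing its children's forecasts."""
    if node in bottom:
        return {node: bottom[node]}
    result = {}
    total = 0.0
    for child in hierarchy[node]:
        result.update(bottom_up(child, hierarchy, bottom))
        total += result[child]
    result[node] = total
    return result

def is_coherent(forecasts, hierarchy, tol=1e-9):
    """Check that every parent forecast equals the sum of its children."""
    return all(
        abs(forecasts[parent] - sum(forecasts[c] for c in children)) <= tol
        for parent, children in hierarchy.items()
    )

forecasts = bottom_up("total", hierarchy, bottom_forecasts)
print(forecasts["total"], is_coherent(forecasts, hierarchy))
```

Forecasts produced independently at each level generally fail this check, which is why methods that stay coherent while remaining accurate are of interest.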

Towards realistic single-task continuous learning research for NER

Justin Payan, Yuval Merhav, He Xie, Satyapriya Krishna, Anil Ramakrishna, Mukund Sridhar, Rahul Gupta

EMNLP 2021

2021

There is an increasing interest in continuous learning (CL), as data privacy is becoming a priority for real-world machine learning applications. Meanwhile, there is still a lack of academic NLP benchmarks that are applicable for realistic CL settings, which is a major challenge for the advancement of the field. In this paper we discuss some of the unrealistic data characteristics of public datasets, study

Conversational AI

GAN-control: Explicitly controllable GANs

Alon Shoshan, Nadav Bhonker, Igor Kviatkovsky, Gérard Medioni

2021

We present a framework for training GANs with explicit control over generated facial images. We are able to control the generated image by setting exact attributes such as age, pose, expression, etc. Most approaches for manipulating GAN-generated images achieve partial control by leveraging the latent space disentanglement properties, obtained implicitly after standard GAN training. Such methods are able

Computer vision

Maximum weight independent set vehicle routing instances

Yuanyuan Dong, Andrew V. Goldberg, Alexander Noe, Nikos Parotsidis, Mauricio G. C. Resende, Quico Spaen

2021

We present a set of new instances of the maximum weight independent set problem. These instances are derived from a real-world vehicle routing problem and are challenging to solve in part because of their large size. We present instances with up to 881 thousand nodes and 383 million edges.

Computer vision

CrossNorm (CN) and SelfNorm (SN)

Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris Metaxas

2021

This is the official PyTorch implementation of our CNSN paper, in which we propose CrossNorm (CN) and SelfNorm (SN), two simple, effective, and complementary normalization techniques to improve generalization robustness under distribution shifts.

Computer vision

Flexible model aggregation for quantile regression

Rasool Fakoor, Taesup Kim, Jonas Mueller, Alex Smola, Ryan Tibshirani

2021

Quantile regression is a fundamental problem in statistical learning motivated by the need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive. For instance, epidemiological forecasts, cost estimates, and revenue predictions all benefit from being able to quantify the range of possible values accurately. As such, many models have been developed for this

Machine learning

DGL-LifeSci

Mufei Li, Jinjing Zhou, Jiajing Hu, Wenxuan Fan, Yangkang Zhang, Yaxin Gu, George Karypis

2021

Graph neural networks (GNNs) constitute a class of deep learning methods for graph data. They have wide applications in chemistry and biology, such as molecular property prediction, reaction prediction, and drug-target interaction prediction. Despite the interest, GNN-based modeling is challenging as it requires graph data preprocessing and modeling in addition to programming and deep learning. Here, we

Machine learning

A baseline for few-shot image classification

Guneet Singh Dhillon, Pratik Chaudhari, Avinash Ravichandran, Stefano Soatto

2021

Fine-tuning a deep network trained with the standard cross-entropy loss is a strong baseline for few-shot learning. When fine-tuned transductively, this outperforms the current state-of-the-art on standard datasets such as Mini-ImageNet, TieredImageNet, CIFAR-FS and FC-100 with the same hyper-parameters. The simplicity of this approach enables us to demonstrate the first few-shot learning results on the

Computer vision

Commonsense-Dialogues

Pei Zhou, Karthik Gopalakrishnan, Behnam Hedayatnia, Seokhwan Kim, Jay Pujara, Xiang Ren, Yang Liu, Dilek Hakkani-Tür

2021

We present Commonsense-Dialogues, a crowdsourced dataset of ~11K dialogues grounded in social contexts involving utilization of commonsense. The social contexts used were sourced from the train split of the SocialIQA dataset, a multiple-choice question-answering based social commonsense reasoning benchmark. For the collection of the Commonsense-Dialogues dataset, each Turker was presented a social context

Conversational AI

Sentence representations learning with transformers

Dejiao Zhang, Wei Xiao, Henghui Zhu, Xiaofei Ma, Andrew O. Arnold, Shang-Wen Li, Ramesh Nallapati, Bing Xiang

2021

Despite profound successes, contrastive representation learning relies on carefully designed data augmentations using domain-specific knowledge. This challenge is magnified in natural language processing, where no general rules exist for data augmentation due to the discrete nature of natural language. We tackle this challenge by presenting a Virtual augmentation Supported Contrastive Learning of sentence

Conversational AI

Question answering NLU

Mahdi Namazifar, Alexandros Papangelis, Gokhan Tur, Dilek Hakkani-Tür

2021

Question Answering NLU (QANLU) is an approach that maps the NLU task into question answering, leveraging pre-trained question-answering models to perform well on few-shot settings. Instead of training an intent classifier or a slot tagger, for example, we can ask the model intent- and slot-related questions in natural language: Context: I'm looking for a cheap flight to Boston. Question: Is the user looking

Conversational AI

Relaxed adaptive projection

Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Ankit Siva

2021

We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like k-way marginals, subject to differential privacy. Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation, and then attempts to find the synthetic dataset that most closely matches

Security, privacy, and abuse prevention

Blending anti-aliasing into vision transformer

Shengju Qian, Hao Shao, Yi Zhu, Mu Li, Jiaya Jia

2021

Transformer architectures, based on the self-attention mechanism and a convolution-free design, have recently achieved superior performance and found booming applications in computer vision. However, the discontinuous patch-wise tokenization process implicitly introduces jagged artifacts into attention maps, giving rise to the traditional problem of aliasing for vision transformers. The aliasing effect occurs when discrete patterns

Computer vision

Gender profession data

Xing Niu, Georgiana Dinu, Prashant Mathur, Anna Currey

2021

The training data used in NMT is rarely controlled with respect to specific attributes, such as word casing or gender, which can cause errors in translations. We argue that predicting the target word and attributes simultaneously is an effective way to ensure that translations are more faithful to the training data distribution with respect to these attributes. Experimental results on two tasks, uppercased

Conversational AI
