Search - Amazon Science

Efficient deep learning inference on edge devices

Ziheng Jiang, Tianqi Chen, Mu Li

SysML 2018

2018

Deploying deep learning (DL) models on edge devices is getting popular nowadays. The huge diversity of edge devices, with both computation and memory constraints, however, makes efficient deployment challenging. In this paper, we propose a two-stage pipeline that optimizes DL models on target devices. The first stage optimizes the inference workloads, and the second stage searches optimal kernel implementations

Machine learning

Learning fashion traits with label uncertainty

Assaf Neuberger, Sharon Alpert, Eli Alshan, Nati Bubis, Eduard Oks

CVPR 2018

2018

We consider the task of predicting subjective fashion traits from images. Specifically, we are interested in understanding which outfit actually better suits the user. Since these traits are highly subjective, they tend to be noisier. One solution is to annotate each example several times, but this makes it hard to collect large amounts of data.

Machine learning

Research challenges in building a voice-based artificial personal shopper

Nut Limsopatham, Oleg Rokhlenko, David Carmel

EMNLP 2018

2018

Recent advances in automatic speech recognition lead toward enabling a voice conversation between a human user and an intelligent virtual assistant. This provides a potential foundation for developing artificial personal shoppers for e-commerce websites, such as Alibaba, Amazon, and eBay. Personal shoppers are valuable to the on-line shops as they enhance user engagement and trust by promptly dealing with

Conversational AI

Online sparse linear regression

Dean Foster, Satyen Kale, Howard Karloff

STOC 2014

2018

We consider the online sparse linear regression problem, which is the problem of sequentially making predictions observing only a limited number of features in each round, to minimize regret with respect to the best sparse linear regressor, where prediction accuracy is measured by square loss. We give an inefficient algorithm that obtains regret bounded by Õ(√T) after T prediction rounds. We complement

Machine learning

DEEQU - Data quality validation for machine learning pipelines

Sebastian Schelter, Philipp Schmidt, Tammo Rukat, Mario Kiessling, Andrey Taptunov, Felix Biessmann, Dustin Lange

NeurIPS 2018

2018

Modern machine learning (ML) systems are comprised of complex ML pipelines which typically have many implicit assumptions about the data they consume (e.g., about the scales of variables, the presence of missing values or the dictionary of categorical values). Violations of these assumptions can result in crashes or wrong predictions. We therefore present Deequ, a library that allows users to explicitly

Information and knowledge management

Record2Vec: Unsupervised representation learning for structured records

Adelene Sim, Andrew Borthwick

ICDM 2018

2018

Structured records – data with a fixed number of descriptive fields (or attributes) – are often represented by one-hot encoded or term frequency-inverse document frequency (TF-IDF) weighted vectors. These vectors are typically sparse and long, and are inefficient in representing structured records. Here, we introduce Record2Vec, a framework for generating dense embeddings of structured records by training

Machine learning

"Deep" learning for missing value imputation in tables with non-numeric data

Felix Biessmann, David Salinas, Dustin Lange, Philipp Schmidt, Sebastian Schelter

CIKM 2018

2018

The success of applications that process data critically depends on the quality of the ingested data. Completeness of a data source is essential in many cases. Yet, most missing value imputation approaches suffer from severe limitations. They are almost exclusively restricted to numerical data, and they either offer only simple imputation methods or are difficult to scale and maintain in production. Here

Information and knowledge management

SpotLight: Detecting anomalies in streaming graphs

Dhivya Eswaran, Christos Faloutsos, Sudipto Guha, Nina Mishra

KDD 2018

2018

How do we spot interesting events from e-mail or transportation logs? How can we detect port scan or denial of service attacks from IP-IP communication data? In general, given a sequence of weighted, directed or bipartite graphs, each summarizing a snapshot of activity in a time window, how can we spot anomalous graphs containing the sudden appearance or disappearance of large dense subgraphs (e.g., near

Information and knowledge management

OpenTag: Open attribute extraction from product profiles

Guineng Zheng, Subhabrata Mukherjee, Xin Luna Dong, Feifei Li

KDD 2018

2018

Extraction of missing attribute values aims to find values describing an attribute of interest from a free text input. Most past related work on extraction of missing attribute values works with a closed-world assumption with the possible set of values known beforehand, or uses dictionaries of values and hand-crafted features. How can we discover new attribute values that we have never seen before? Can we do

Information and knowledge management

CERES: Distantly supervised relation extraction from the semi-structured web

Colin Lockard, Xin Luna Dong, Arash Einolghozati, Prashant Shiralkar

VLDB 2018

2018

The web contains countless semi-structured websites, which can be a rich source of information for populating knowledge bases. Existing methods for extracting relations from the DOM trees of semi-structured webpages can achieve high precision and recall only when manual annotations for each website are available. Although there have been efforts to learn extractors from automatically generated labels, these

Information and knowledge management

Automating large-scale data quality verification

Sebastian Schelter, Dustin Lange, Philipp Schmidt, Meltem Celikel, Felix Biessmann

VLDB 2018

2018

Modern companies and institutions rely on data to guide every single business process and decision. Missing or incorrect information seriously compromises any decision process downstream. Therefore, a crucial, but tedious task for everyone involved in data processing is to verify the quality of their data. We present a system for automating the verification of data quality at scale, which meets the requirements

Information and knowledge management

How Much Attention Do You Need? A Granular Analysis of Neural Machine Translation Architectures

Tobias Domhan

ACL 2018

2018

With recent advances in network architectures for Neural Machine Translation (NMT), recurrent models have effectively been replaced by either convolutional or self-attentional approaches, such as in the Transformer. While the main innovation of the Transformer architecture is its use of self-attentional layers, there are several other aspects, such as attention with multiple heads and the use of many attention

Machine learning

Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models

David Vilar

NAACL 2018

2018

In this paper we explore the use of Learning Hidden Unit Contribution for the task of neural machine translation. The method was initially proposed in the context of speech recognition for adapting a general system to the specific acoustic characteristics of each speaker. Similar in spirit, in a machine translation framework we want to adapt a general system to a specific domain. We show that the proposed

Machine learning

Leveraging data resources for cross linguistic information retrieval using statistical machine translation

Steve Sloto, Ann Clifton, Greg Hanneman, Patrick Porter, Donna Gates, A. Silja Hil

AMTA 2018

2018

Retail websites may provide customers with a localized user experience by allowing them to use a secondary language of preference. Automatic translation of user search queries is a crucial component of this experience. Several domain-adapted SMT systems for search query translation were trained, including language pairs for which smaller-than-desired parallel resources were available, such as Polish-German

Machine learning

Persistent and robust execution of MAPF schedules in warehouses

Wolfgang Hönig, Scott Kiesel, Andrew Tinka, Joseph W. Durham, Nora Ayanian

IEEE Robotics and Automation Letters 2018

2018

Multi-Agent Path Finding (MAPF) is a well-studied problem in Artificial Intelligence that can be solved quickly in practice when using simplified agent assumptions. However, real-world applications, such as warehouse automation, require physical robots to function over long time horizons without collisions. We present an execution framework that can use existing single-shot MAPF planners and ensures robust execution in the presence of unknown or time-varying higher-order dynamic limits, unforeseen robot slow-downs, and unpredictable obstacle appearances.

Robotics

Buy it again: Modeling repeat purchase recommendations

Rahul Bhagat, Srevatsan Muralidharan, Alex Lobzhanidze, Shankar Vishwanath

KDD 2018

2018

Repeat purchasing, i.e., a customer purchasing the same product multiple times, is a common phenomenon in retail. As more customers start purchasing consumable products (e.g., toothpastes, diapers, etc.) online, this phenomenon has also become prevalent in e-commerce. However, in January 2014, when we looked at popular e-commerce websites, we did not find any customer-facing features that recommended products

Search and information retrieval

Scalable Hyperparameter Transfer Learning

Valerio Perrone, Rodolphe Jenatton, Matthias Seeger, Cédric Archambeau

NeurIPS 2018

2018

Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization, such as hyperparameter optimization. Typically, BO relies on conventional Gaussian process (GP) regression, whose algorithmic complexity is cubic in the number of evaluations. As a result, GP-based BO cannot leverage large numbers of past function evaluations, for example, to warm-start related BO runs

Machine learning

Mutual information guided distillation for transfer learning

Sung-soo Ahn, Shell Hu, Zhenwen Dai, Andreas Damianou, Neil Lawrence

NeurIPS 2018

2018

We consider the teacher-student framework for knowledge transfer, where the goal is to improve learning of a “student” neural network, given a “teacher” neural network pretrained on the same or a similar task. The majority of existing approaches for distilling knowledge from a teacher network to a student network rely on matching either activations or handcrafted features from the teacher network. Instead

Machine learning

SeCSeq: Semantic coding for sequence-to-sequence based extreme multi-label classification

Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Yiming Wang

NeurIPS 2018

2018

Extreme multi-label classification (XMC) aims at assigning to an instance the most relevant subset of labels from a colossal label set. There has been some success in formulating the multi-label problem as sequence-to-sequence (Seq2Seq) learning, where the positive class labels of each input instance are used as the corresponding output sequence. Seq2Seq methods, nonetheless, have not yet been scalable

Machine learning

Facilitating Bayesian continual learning by natural gradients and Stein gradients

Yu Chen, Tom Diethe, Neil Lawrence

NeurIPS 2018

2018

Continual learning aims to enable machine learning models to learn a general solution space for past and future tasks in a sequential manner. Conventional models tend to forget the knowledge of previous tasks while learning a new task, a phenomenon known as catastrophic forgetting. When using Bayesian models in continual learning, knowledge from previous tasks can be retained in two ways: (i) posterior

Machine learning
