Search - Amazon Science

Bayesian optimization by density ratio estimation

Louis Tiao, Aaron Klein, Matthias Seeger, Cédric Archambeau, Edwin Bonilla, Fabio Ramos

NeurIPS 2020 Workshop on Meta-learning

2020

Bayesian optimization (BO) is among the most effective and widely used blackbox optimization methods. BO proposes solutions according to an explore-exploit trade-off criterion encoded in an acquisition function, many of which are derived from the posterior predictive of a probabilistic surrogate model. Prevalent among these is the expected improvement (EI). Naturally, the need to ensure analytical tractability

Machine learning
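
For readers unfamiliar with the expected improvement (EI) criterion named in the abstract above, the following is a minimal sketch of its standard closed form under a Gaussian posterior predictive, written for minimization. It is not the density-ratio method the paper proposes; the function name `expected_improvement` and the parameters `mu`, `sigma`, and `best` are illustrative assumptions.

```python
import math


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Closed-form EI for minimization, assuming a Gaussian predictive N(mu, sigma^2).

    mu, sigma: predictive mean and standard deviation at a candidate point (assumed names).
    best: lowest objective value observed so far (assumed name).
    """
    if sigma <= 0.0:
        # With no predictive uncertainty, the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (best - mu) * cdf + sigma * pdf
```

The first term rewards candidates whose predicted mean is already below the incumbent (exploitation), and the second rewards predictive uncertainty (exploration). The closed form relies on the Gaussian assumption, which is the kind of tractability requirement the abstract alludes to.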

Lightweight LLM for converting text to structured data

Karim Bouyarmane

February 6, 2025

Novel training procedure and decoding mechanism enable model to outperform much larger foundation model prompted to perform the same task.

Conversational AI

The history of Amazon's recommendation algorithm

Larry Hardesty

November 22, 2019

In 2017, when the journal IEEE Internet Computing was celebrating its 20th anniversary, its editorial board decided to identify the single paper from its publication history that had best withstood the “test of time”. The honor went to a 2003 paper called “Amazon.com Recommendations: Item-to-Item Collaborative Filtering”, by then-Amazon researchers Greg Linden, Brent Smith, and Jeremy York.

Search and information retrieval

Dialogue Boost: How Amazon is using AI to enhance TV and movie dialogue

Yuzhou Liu, Trausti Kristjansson

December 10, 2025

New audio-processing technology is making entertainment more accessible for millions of viewers.

Conversational AI

EACL 2023: Language processing at the dawn of the LLM era

Larry Hardesty

May 5, 2023

Prompt engineering, adaptation of language models, and attempts to remediate large language models’ (LLMs’) “hallucinations” point toward future research in the field.

Conversational AI

Extreme model compression for on-device natural language understanding

Kanthashree Mysore Sathyendra, Samridhi Choudhary, Leah Nicolich-Henkin

COLING 2020

2020

In this paper, we propose and experiment with techniques for extreme compression of neural natural language understanding (NLU) models, making them suitable for execution on resource-constrained devices. We propose a task-aware, end-to-end compression approach that performs word-embedding compression jointly with NLU task learning. We show our results on a large-scale, commercial NLU system trained on a

Conversational AI

Huseyin Topaloglu receives Cornell endowed faculty chair

Staff writer

March 18, 2022

The Howard and Eleanor Morgan Professor is awarded to a Cornell faculty member who has made meaningful contributions to operations research.

Operations research and optimization

Amazon Kids links up with Boston Children’s Hospital’s Digital Wellness Lab

Mariana Lenharo

May 24, 2021

Science-based recommendations from the Digital Wellness Lab could inform the development of digital products that help children.

Interpretable ensemble models improve product retrieval

Nurendra Choudhary

July 3, 2024

Gradient-boosted decision trees aggregate model outputs, and Shapley values help identify the most useful models for the ensemble.

Search and information retrieval

Automating large-scale data quality verification

Sebastian Schelter, Dustin Lange, Philipp Schmidt, Meltem Celikel, Felix Biessmann

VLDB 2018

2018

Modern companies and institutions rely on data to guide every single business process and decision. Missing or incorrect information seriously compromises any decision process downstream. Therefore, a crucial, but tedious task for everyone involved in data processing is to verify the quality of their data. We present a system for automating the verification of data quality at scale, which meets the requirements

Information and knowledge management

Compressed video action recognition

Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alex Smola, Philipp Krähenbühl

CVPR 2018

2018

Training robust deep video representations has proven to be much more challenging than learning deep image representations. This is in part due to the enormous size of raw video streams and the high temporal redundancy; the true and interesting signal is often drowned in too much irrelevant data. Motivated by the fact that the superfluous information can be reduced by up to two orders of magnitude by video compression

Computer vision

signSGD: compressed optimisation for non-convex problems

Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Animashree Anandkumar

ICML 2018

2018

Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. signSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative ℓ1/ℓ2 geometry of gradients

Machine learning
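
The idea in the signSGD abstract above, transmitting only the sign of each gradient coordinate, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the helper names (`compress`, `aggregate`, `step`) are hypothetical, and the majority-vote aggregation across workers is an assumption that the excerpt itself does not state.

```python
from typing import List


def sign(v: float) -> int:
    """Return +1, -1, or 0 according to the sign of v."""
    return (v > 0) - (v < 0)


def compress(grad: List[float]) -> List[int]:
    """Worker side: keep only the sign of each gradient coordinate."""
    return [sign(g) for g in grad]


def aggregate(worker_signs: List[List[int]]) -> List[int]:
    """Server side (assumed rule): coordinate-wise majority vote over workers."""
    return [sign(sum(column)) for column in zip(*worker_signs)]


def step(params: List[float], update_signs: List[int], lr: float) -> List[float]:
    """Move each parameter by a fixed step lr against the agreed sign."""
    return [p - lr * s for p, s in zip(params, update_signs)]
```

Each worker sends one sign per coordinate instead of a full floating-point value, which is where the communication savings come from. The step size is the same for every coordinate, so the gradient magnitude no longer affects the update.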

Direct optimization of F-measure for retrieval-based personal question answering

Rasool Fakoor, Amanjit Kainth, Siamak Shakeri, Christopher Winestock, Abdel-Rahman Mohamed, Ruhi Sarikaya

SLT 2018

2018

Recent advances in spoken language technologies and the introduction of many customer-facing products have given rise to a wide customer reliance on smart personal assistants for many of their daily tasks. In this paper, we present a system to reduce users’ cognitive load by extending personal assistants with long-term personal memory where users can store and retrieve, by voice, arbitrary pieces of information

Conversational AI

Question type guided attention in visual question answering

Yang Shi, Tommaso Furlanello, Sheng Zha, Animashree Anandkumar

ECCV 2018

2018

Visual Question Answering (VQA) requires integration of feature maps with drastically different structures. Image descriptors have structures at multiple spatial scales, while lexical inputs inherently follow a temporal sequence and naturally cluster into semantically different question types. Many previous works use complex models to extract feature representations but neglect to use high-level information

Computer vision

Multiplicative tree-structured long short-term memory networks for semantic representations

Nam Khanh Tran, Weiwei Cheng

NAACL 2018

2018

Tree-structured LSTMs have shown advantages in learning semantic representations by exploiting syntactic information. Most existing methods model tree structures by bottom-up combinations of constituent nodes using the same shared compositional function and often making use of input word information only. The inability to capture the richness of compositionality makes these models lack expressive power.

Conversational AI

A multi-objective rule optimizer with an application to risk management

Pietari Pulkkinen, Neetesh Tiwari, Akhil Kumar, Christopher Jones, Yan Zhang

ICMLA 2018

2018

Managing risk is important to any e-commerce merchant. Combining various machine learning (ML) models with a rule set as the decision layer is a common practice for managing risk. Unlike the ML models, which can be automatically refreshed periodically based on new risk patterns, rules are generally static and rely on manual updates. To tackle that, this paper presents a data-driven and automated rule optimization

Operations research and optimization

LSTM-based Whisper Detection

Zeynab Raeesy, Kellen Gillespie, Chengyuan Ma, Thomas Drugman, Jiacheng Gu, Roland Maas, Ariya Rastrow, Björn Hoffmeister

SLT 2018

2018

This article presents a whisper speech detector in the far-field domain. The proposed system consists of a long short-term memory (LSTM) neural network trained on log-filterbank energy (LFBE) acoustic features. This model is trained and evaluated on recordings of human interactions with voice-controlled, far-field devices in whisper and normal phonation modes. We compare multiple inference approaches for

Conversational AI

Automated reasoning at Amazon: A conversation

Larry Hardesty

August 8, 2022

To mark the occasion of the eighth Federated Logic Conference (FLoC), Amazon’s Byron Cook, Daniel Kröning, and Marijn Heule discussed automated reasoning’s prospects.

Automated reasoning

The path to carbon reductions in high-growth economic sectors

Miguel Jaller

August 1, 2022

Confronting climate change requires the participation of governments, companies, academics, civil-society organizations, and the public.

Sustainability

Singing synthesis: With a little help from my attention

Orazio Angelini, Alexis Moinet, Kayoko Yanagisawa, Thomas Drugman

Interspeech 2020

2020

We present UTACO, a singing synthesis model based on an attention-based sequence-to-sequence mechanism and a vocoder based on dilated causal convolutions. These two classes of models have significantly affected the field of text-to-speech, but have never been thoroughly applied to the task of singing synthesis. UTACO demonstrates that attention can be successfully applied to the singing synthesis field

Conversational AI
