Customer-obsessed science

Virtual try-all: Visualizing any product in any personal setting

April 16, 2024

First model to work across a wide range of products uses a second U-Net encoder to capture fine-grained product details.

Computer vision

A quick guide to Amazon's 20+ papers at ICASSP 2024

April 11, 2024

This year’s papers address topics such as speech enhancement, spoken-language understanding, dialogue, paralinguistics, and pitch estimation.

Conversational AI

Using Amazon web traffic to track the eclipse

April 11, 2024

An animation that projects traffic fluctuations onto the U.S. map offers an example of how the Supply Chain Optimization Technologies team uses data visualization to glean insights.

Operations research and optimization

Conference calendar
- ICLR 2024
  
  Machine learning
  
  May 7 - 11, 2024
- The Web Conference 2024
  
  Information and knowledge management
  
  May 13 - 17, 2024
- LREC-COLING 2024
  
  Conversational AI
  
  May 20 - 25, 2024

Image grid shows several of the recipients of the 2023 fall Amazon Research Awards

Amazon Research Awards recipients announced

April 26, 2024

Awardees, who represent 51 universities in 15 countries, have access to Amazon public datasets, along with AWS AI/ML services and tools.

De-noised vision-language fusion guided by visual cues for e-commerce product search

Zhizhang Hu, Shasha Li, Ming Du, Erica Aduh, Arnab Dhua, Doug Gray

CVPR 2024 Workshop on Multimodal Learning and Applications

2024

In e-commerce applications, vision-language multimodal transformer models play a pivotal role in product search. The key to successfully training a multimodal model lies in the alignment quality of image-text pairs in the dataset. However, the data in practice is often automatically collected with minimal manual intervention. Hence the alignment of image-text pairs is far from ideal. In e-commerce, this

Computer vision

Benchmarking zero-shot recognition with vision-language models: Challenges on granularity and specificity

Zhenlin Xu, Yi Zhu, Tiffany Deng, Abhay Mittal, Yanbei Chen, Manchen Wang, Paolo Favaro, Joe Tighe, Davide Modolo

CVPR 2024 Workshop on "What is Next in Multimodal Foundation Models?"

2024

This paper presents novel benchmarks for evaluating vision-language models (VLMs) in zero-shot recognition, focusing on granularity and specificity. Although VLMs excel in tasks like image captioning, they face challenges in open-world settings. Our benchmarks test VLMs’ consistency in understanding concepts across semantic granularity levels and their response to varying text specificity. Findings show

Computer vision

A simple strategy for body estimation from partial-view images

Yafei Mao, Xuelu Li, Brandon Smith, JinJin Li, Raja Bala

CVPR 2024 Workshop on Computer Vision for Fashion, Art, and Design

2024

Virtual try-on and product personalization have become increasingly important in modern online shopping, highlighting the need for accurate body measurement estimation. Although previous research has advanced in estimating 3D body shapes from RGB images, the task is inherently ambiguous as the observed scale of human subjects in the images depends on two unknown factors: capture distance and body dimensions

Computer vision

The science behind Echo Frames

April 9, 2024

How the team behind Echo Frames delivered longer battery life and improved sound quality inside the slim form factor of a pair of eyeglasses.

Amazon Research Awards issues spring 2024 call for proposals

March 27, 2024

The submission period opens March 27 and closes on May 7.

How Thomas Hoe helps Amazon understand European customers

March 21, 2024

The principal economist and his team address unique challenges using techniques at the intersection of microeconomics, statistics, and machine learning.

Economics
