Amazon Science homepage

Amazon scientists and policy experts discuss how the company’s responsible-AI pipeline embeds safety and values throughout the AI development lifecycle.

How mechanism design theory helps optimize Amazon-vendor collaboration

Agentic mechanism enables Amazon and vendors to optimize supply chain management without disclosing private information.

Preserving the privacy of AI training data

How we reproduced three attacks that extract private training data from AI models and the cryptographic defenses that stop them.

Navigating uncertainty in Amazon's middle-mile network

Amazon engineers and scientists have created new tools to optimize delivery networks under uncertainty — and keep them adapting without missing a beat.

The proof assistant behind the Nitro Isolation Engine

Isabelle/HOL's balance of expressiveness, automation, and scalability enabled the world's first formally verified cloud hypervisor.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

View all

Technical deep-dives and perspectives from our scientists.

View all

Making LLMs faster without sacrificing accuracy

May 15, 2026

5 min read

A new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.

Conversational AI
Promptimus: Improving already good LLM prompts with zero manual engineering

May 14, 2026

16 min read
How catastrophic is your LLM?

April 27, 2026

4 min read

Conversational AI
Customized Amazon Nova models improve molecular-property prediction in drug discovery

April 15, 2026

8 min read

Machine learning
How Amazon uses agentic AI for vulnerability detection at global scale

April 8, 2026

6 min read

Security, privacy, and abuse prevention

View all

AWS and Hopkins Engineering announce database for AI/ML antibody design

The Antibody Developability Benchmark is powered by one of the most diverse antibody datasets in public literature, enabling transparent performance evaluation for AI-guided antibody design.

Amazon Research Awards issues Spring CFP

Now open across seven research areas, including Agentic AI and Robotics. Applicants receive unrestricted funds, AWS promotional credits, and training resources. Submission deadline is now May 13.

FINAL - making a mind Series Image (16x9).png

“Making a Mind” podcast explores science of intelligence

Hosted by Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features conversations with leading AI researchers about the breakthroughs needed to achieve general intelligence.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

Stage: Query execution time prediction in Amazon Redshift

Ziniu Wu, Ryan Marcus, Zhengchun Liu, Parimarjan Negi, Vikram Nathan, Pascal Pfeil, Gaurav Saxena, Mohammad Rahman, Murali Narayanaswamy, Tim Kraska

SIGMOD/PODS 2024

2024

Query performance (e.g., execution time) prediction is a critical component of modern DBMSes. As a pioneering cloud data warehouse, Amazon Redshift relies on an accurate execution time prediction for many downstream tasks, ranging from high-level optimizations, such as automatically creating materialized views, to low-level tasks on the critical path of query execution, such as admission, scheduling, and

Cloud and systems
Entity disambiguation with extreme multi-label ranking

Jyun-Yu Jiang, Wei-Cheng Chang, Jiong Zhang, Cho-Jui Hsieh, Hsiang-Fu Yu

The Web Conference 2024

2024

Entity disambiguation is one of the most important natural language tasks to identify entities behind ambiguous surface mentions within a knowledge base. Although many recent studies apply deep learning to achieve decent results, they need exhausting pretraining and mediocre recall in the retrieval stage. In this paper, we propose a novel framework, eXtreme Multi-label Ranking for Entity Disambiguation

Search and information retrieval
Slapo: A schedule language for progressive optimization of large deep learning model training

Hongzheng Chen, Cody Hao Yu, Shuai Zheng, Zhen Zhang, Zhiru Zhang, Yida Wang

ASPLOS 2024

2024

Recent years have seen an increase in the development of large deep learning (DL) models, which makes training efficiency crucial. Common practice is struggling with the trade-o between usability and performance. On one hand, DL frameworks such as PyTorch use dynamic graphs to facilitate model developers at a price of sub-optimal model training performance. On the other hand, practitioners propose various

Cloud and systems
DISTMM: Accelerating distributed multimodal model training

Jun Huang, Zhen Zhang, Shuai Zheng, Feng Qin, Yida Wang

NSDI 2024: 21st USENIX Symposium on Networked Systems Design and Implementation

2024

Multimodal model training takes multiple types of inputs to process with differently structured submodules, and aggregates outcomes from the submodules to learn the relationship among various types of inputs, e.g., correlating text to image for text-to-image generation. The differences of submodule architectures as well as their inputs lead to heterogeneity in terms of computation efficiency. Failing to

Computer vision
Mixed-type tabular data synthesis with score-based diffusion in latent space

Hengrui Zhang, Jiani Zhang, Balasubramaniam Srinivasan, Zhengyuan Shen, Xiao Qin, Christos Faloutsos, Huzefa Rangwala, George Karypis

ICLR 2024

2024

Recent advances in tabular data generation have greatly enhanced synthetic data quality. However, extending diffusion models to tabular data is challenging due to the intricately varied distributions and a blend of data types of tabular data. This paper introduces TABSYN, a methodology that synthesizes tabular data by leveraging a diffusion model within a variational autoencoder (VAE) crafted latent space

Machine learning

Russ Tedrake (Massachusetts Institute of Technology).JPG

Gretchen Ertl

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and large language model coding security.

Credit: Wolfram Scheible

Research collaborations

We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.

Pai-Ling Yin, senior manager of research science, is seen speaking to a classroom, there is a chalkboard behind her and she is gesturing with her hands.

Courtesy of Pai-Ling Yin

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.

Customer-obsessed science

Research areas

From the blog

Featured news

Publications

Collaborations

Work with us