Amazon Science homepage

How flat is replacing fat in AWS data center networks

“Quasi-random” network topologies and new passive optical components called ShuffleBoxes make more-efficient flat networks as practical as traditional “fat-tree” networks.

Preserving the privacy of AI training data

How we reproduced three attacks that extract private training data from AI models and the cryptographic defenses that stop them.

Navigating uncertainty in Amazon's middle-mile network

Amazon engineers and scientists have created new tools to optimize delivery networks under uncertainty — and keep them adapting without missing a beat.

The overthinking problem in AI

Reasoning models can generate seven to ten times as many tokens as necessary on simple tasks, creating unsustainable costs at scale. Amazon's vision for metacognitive AI could fundamentally shift how models allocate computational resources.

Intelligence isn’t about parameter count. It’s about time.

As AI models grow larger, they become less insightful, not more. To ensure that they continue to learn, we need to reduce their inference time.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

Technical deep-dives and perspectives from our scientists.

Diverse reasoning traces teach LLMs to make better decisions

May 26, 2026

5 min read

How to train language models to generate diverse, accurate reasoning paths using tokens that control distinct reasoning strategies.

Conversational AI

Making LLMs faster without sacrificing accuracy

May 15, 2026

5 min read

Conversational AI

Promptimus: Improving already good LLM prompts with zero manual engineering

May 14, 2026

16 min read

How mechanism design theory helps optimize Amazon-vendor collaboration

May 5, 2026

7 min read

Operations research and optimization

Building trust into AI

May 4, 2026

13 min read

Security, privacy, and abuse prevention

Coming soon: Season 2

Hosted by Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features researchers tackling the hardest problems in agentic AI — from building reliable perception systems to designing training environments that mirror human learning.

AWS and Hopkins Engineering announce database for AI/ML antibody design

The Antibody Developability Benchmark is powered by one of the most diverse antibody datasets in public literature, enabling transparent performance evaluation for AI-guided antibody design.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

Where did it all go wrong? A hierarchical look into multi-agent error attribution

Adi Banerjee, Anirudh Nair, Tarik Borogovac

NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle

2025

Error attribution in Large Language Model (LLM) multi-agent systems presents a significant challenge in debugging and improving collaborative AI systems. Current approaches to pinpointing agent and step level failures in multi-agent interaction traces—whether using all-at-once evaluation, step-by-step analysis, or binary search—fall short when analyzing complex patterns, struggling with both accuracy and

Conversational AI

Efficiently generating correlated sample paths from multi-step time series foundation models

Ethan Baron, Boris Oreshkin, Ruijun Ma, Hanyu Zhang, Kari Torkkola, Michael Mahoney, Andrew Gordon Wilson, Tatiana Konstantinova

NeurIPS 2025 Workshop on Recent Advances in Time Series Foundation Models

2025

Many time series applications require access to multi-step forecast trajectories in the form of sample paths. Recently, time series foundation models have leveraged multi-step lookahead predictions to improve the quality and efficiency of multi-step forecasts. However, these models only predict independent marginal distributions for each time step, rather than a full joint predictive distribution. To generate

Machine learning

Beyond collaborative filtering: Using transformers for personalized music recommendation

Tim Greer, Nicholas Capel, Yannik Stein, Giuseppe Di Benedetto, Emanuele Coviello, Amina Shabbeer

NeurIPS 2025

2025

Music recommendation systems face the dual challenge of capturing both immediate context and long-term preferences in users' listening patterns. We adapt a generalized sequential model architecture for music recommendation, introducing modifications that acknowledge how music preferences combine temporal patterns and stable tastes. By removing causal masking constraints typically used in sequential models

Machine learning

Structuring the unstructured: A multi-agent LLM framework for transforming ambiguous SOPs into code

Sachin Kumar Giroh, Pushpendu Ghosh, Aryan Jain, Harshal Paunikar, Anish Nediyanchath, Aditi Rastogi, Promod Yenigalla

EMNLP 2025

2025

This paper introduces a three-stage multi-agent LLM framework designed to transform unstructured and ambiguous Standard Operating Procedures (SOPs) into a structured plan and an executable code template. Unstructured SOPs, common across industries such as finance, retail, and logistics, frequently suffer from ambiguity, missing information, and inconsistency, all of which hinder automation. We address this

Conversational AI

Statistical power calculations revisited: Incorporating beliefs about effect sizes

Melany Gualavisi, Ryan Kessler, Lorenzo Masoero

Code@MIT 2025

2025

In A/B testing, statistical power depends on both the variance of estimated impacts and the distribution of true impacts. A low variance metric can have low power if true impacts on the metric tend to be small, while a high variance metric can have high power if true impacts on the metric tend to be large. Traditional power calculations, however, focus solely on the variance of estimated impacts. They compute

Economics

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and large language model coding security.

Research collaborations

We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.
