Amazon Science homepage

Preserving the privacy of AI training data

How we reproduced three attacks that extract private training data from AI models, and the cryptographic defenses that stop them.

Navigating uncertainty in Amazon's middle-mile network

Amazon engineers and scientists have created new tools to optimize delivery networks under uncertainty — and keep them adapting without missing a beat.

The overthinking problem in AI

Reasoning models can generate seven to 10 times as many tokens as necessary on simple tasks, creating unsustainable costs at scale. Amazon's vision for metacognitive AI could fundamentally shift how models allocate computational resources.

Intelligence isn’t about parameter count. It’s about time.

As AI models grow larger, they become less insightful, not more. To ensure that they continue to learn, we need to reduce their inference time.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

Technical deep-dives and perspectives from our scientists.

Diverse reasoning traces teach LLMs to make better decisions

May 26, 2026

5 min read

How to train language models to generate diverse, accurate reasoning paths using tokens that control distinct reasoning strategies.

Conversational AI

Making LLMs faster without sacrificing accuracy

May 15, 2026

5 min read

Conversational AI

Promptimus: Improving already good LLM prompts with zero manual engineering

May 14, 2026

16 min read

How mechanism design theory helps optimize Amazon-vendor collaboration

May 5, 2026

7 min read

Operations research and optimization

Building trust into AI

May 4, 2026

13 min read

Security, privacy, and abuse prevention

Coming soon: Season 2

Hosted by Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features researchers tackling the hardest problems in agentic AI — from building reliable perception systems to designing training environments that mirror human learning.

AWS and Hopkins Engineering announce database for AI/ML antibody design

The Antibody Developability Benchmark is powered by one of the most diverse antibody datasets in public literature, enabling transparent performance evaluation for AI-guided antibody design.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

Delta debugging for LLM-integrated systems

Hao-Nan Zhu, Muhammad Numair Mansur, Martin Schaef, Zeya Chen, Tancrède Lepoint, Willem Visser

ICSE 2026

2026

Large Language Models (LLMs) are increasingly integrated into software systems as automated decision-making components. These systems rely on instruction prompts written in natural language to encode complex workflows. However, debugging these prompts when LLMs produce undesired outputs remains challenging due to their black-box nature and the impracticality of manually inspecting large, complex inputs.

Automated reasoning

Agentic simulacra for synthetic construction management data generation

Vincil Bishop, Nivedha Balakrishnan, Saeideh Shahrokh Esfahani

CSER 2026

2026

Construction management systems require realistic test data capturing complex stakeholder interactions and temporal dependencies, yet accessing real project data remains challenging due to privacy constraints and proprietary information protection. This research addresses a critical systems engineering challenge by introducing agentic simulacra patterns that leverage multi-agent coordination to generate

Automated reasoning

SELENE: Selective and evidence-weighted LLM debating for efficient and reliable reasoning

Akshay Verma, Swapnil Gupta, Siddharth Pillai, Prateek Sircar, Deepak Gupta

EACL 2026

2026

Multi-Agent Debate (MAD) frameworks improve factual reliability in large language models (LLMs) by allowing agents to critique and refine one another's reasoning. Yet, existing MAD systems are computationally expensive and prone to degradation under prolonged debates due to redundant exchanges and unstable judging. We propose a lightweight, industry-deployable alternative that unifies Selective Debate Initiation

Information and knowledge management

Exectune: Effective steering of black-box LLMs with guide models

Vijay Lingam, Aditya Golatkar, Anwesan Pal, Ben Vo, Narayanan Sadagopan, Alessandro Achille, Jun Huan, Anoop Deoras, Stefano Soatto

ICLR 2026 Workshop on Lifelong Agents

2026

For large language models deployed through black-box APIs, recurring inference costs often dominate one-time training costs, motivating composed agentic systems that amortize expensive reasoning into reusable intermediate representations. We study a broad class of such systems, termed Guide–Core Policies (GCOP), in which a guide model generates a structured strategy that is executed by a black-box core

Machine learning

ViLL-E: Video LLM embeddings for retrieval

Rohit Gupta, Jayakrishnan Unnikrishnan, Fan Fei, Sheng Liu

ACL 2026

2026

Video Large Language Models (VideoLLMs) excel at video understanding tasks where outputs are textual, such as Video Question Answering and Video Captioning. However, they underperform specialized embedding-based models in Retrieval tasks, such as Text-to-Video Retrieval and Moment Retrieval. We introduce ViLL-E (Video-LLM-Embed), a unified VideoLLM architecture endowed with a novel embedding generation

Computer vision

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition to drive secure innovation in generative AI, with a focus on responsible AI and large language model coding security.

Research collaborations

We partner with select academic organizations around the world for deep, sustained collaborations in research areas of mutual interest.

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.
