The unseen work of building reliable AI agents

"Reinforcement learning gyms" train agents on the many low-level tasks that they must chain together to execute customer requests.

The 10 most viewed publications

From foundation model safety frameworks and formal verification at cloud scale to advanced robotics and multimodal AI reasoning, these are the most viewed publications from Amazon scientists and collaborators in 2025.

The 10 most viewed blog posts

From quantum computing breakthroughs and foundation models for robotics to the evolution of Amazon Aurora and advances in agentic AI, these are the posts that captured readers' attention in 2025.

Customer-obsessed science

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

Technical deep-dives and perspectives from our scientists.

Fine-tuning vision-language models on memory-constrained devices

January 8, 2026

4 min read

A new hybrid optimization approach allows edge devices to fine-tune vision-language models using only forward passes, achieving up to 7% higher accuracy than existing techniques.

Machine learning

Dialogue Boost: How Amazon is using AI to enhance TV and movie dialogue

December 10, 2025

5 min read

Conversational AI

Amazon Nova Forge: "Open training" paradigm that empowers everyone to build their own frontier AI

December 8, 2025

8 min read

Conversational AI

AutoGluon assistant: Zero-code AutoML through multiagent collaboration

December 5, 2025

6 min read

Machine learning

AI-native 6G: From networks to intelligence fabrics

December 1, 2025

8 min read

Cloud and systems

New “Making a Mind” podcast explores science of intelligence

Hosted by Dr. Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features conversations with leading AI researchers about the breakthroughs needed to achieve general intelligence.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Spring 2025 ARA recipients

Meet the 63 Amazon Research Award (ARA) recipients, who represent 41 universities in 8 countries.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

Multicalibration for confidence scoring in LLMs

Gianluca Detommaso, Martin Bertran Lopez, Riccardo Fogliato, Aaron Roth

ICML 2024

2024

This paper proposes the use of “multicalibration” to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously across various intersecting groupings of the data. We show how to form groupings for prompt/completion pairs that are correlated with the probability of correctness via

Conversational AI

MAML-en-LLM: Model agnostic meta-training of LLMs for improved in-context learning

Sanchit Sinha, Yuguang Yue, Victor Soto, Mayank Kulkarni, Jianhua Lu, Aidong Zhang

KDD 2024

2024

Adapting large language models (LLMs) to unseen tasks with in-context training samples without fine-tuning remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have been proposed such as MetaICL and MetaICT, which involve meta-training pre-trained LLMs on a wide variety of diverse tasks. These meta-training approaches essentially

Conversational AI

Tokenization matters: Navigating data-scarce tokenization for gender inclusive language technologies

Anaelia Ovalle, Ninareh Mehrabi, Palash Goyal, Jwala Dhamala, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Yuval Pinter, Rahul Gupta

NAACL 2024

2024

Gender-inclusive NLP research has documented the harmful limitations of gender binary-centric large language models (LLMs), such as the inability to correctly use gender-diverse English neopronouns (e.g., xe, zir, fae). While data scarcity is a known culprit, the precise mechanisms through which scarcity affects this behavior remain under-explored. We discover LLM misgendering is significantly influenced

Conversational AI

CERET: Cost-effective extrinsic refinement for text generation

Jason Cai, Hang Su, Monica Sunkara, Igor Shalyminov, Saab Mansour

NAACL 2024

2024

Large language models (LLMs) are powerful models for generation tasks, but they may not generate good-quality outputs on their first attempt. Apart from model fine-tuning, existing approaches to improve prediction accuracy and quality typically involve LLM self-improvement or self-reflection that incorporates feedback from the models themselves. Despite their effectiveness, these methods are hindered by their

Conversational AI

SemiGPC: Distribution-aware label refinement for imbalanced semi-supervised learning using Gaussian processes

Abdelhak Lemkhenter, Manchen Wang, Luca Zancato, Gurumurthy Swaminathan, Paolo Favaro, Davide Modolo

CVPR 2024 Workshop on Learning with Limited Labelled Data for Image and Video Understanding

2024

In this paper we introduce SemiGPC, a distribution-aware label refinement strategy based on Gaussian processes, where the predictions of the model are derived from the labels' posterior distribution. Unlike other buffer-based semi-supervised methods such as Co-Match [17] and SimMatch [34], our SemiGPC includes a normalization term that addresses imbalances in the global data distribution while maintaining

Computer vision

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition that drives secure innovation in generative AI technology, with a focus on responsible AI and large language model coding security.

Research collaborations

We partner with select academic organizations around the world for deep, sustained collaborations in multiple research areas of mutual interest.

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.
