Amazon Science homepage

Amazon scientists and policy experts discuss how the company’s responsible-AI pipeline embeds safety and values throughout the AI development lifecycle.

How mechanism design theory helps optimize Amazon-vendor collaboration

Agentic mechanism enables Amazon and vendors to optimize supply chain management without disclosing private information.

Preserving the privacy of AI training data

How we reproduced three attacks that extract private training data from AI models and the cryptographic defenses that stop them.

Navigating uncertainty in Amazon's middle-mile network

Amazon engineers and scientists have created new tools to optimize delivery networks under uncertainty — and keep them adapting without missing a beat.

The proof assistant behind the Nitro Isolation Engine

Isabelle/HOL's balance of expressiveness, automation, and scalability enabled the world's first formally verified cloud hypervisor.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

View all

Technical deep-dives and perspectives from our scientists.

View all

How catastrophic is your LLM?

April 27, 2026

4 min read

A new framework provides a statistical method for estimating the likelihood of catastrophic failures in large language models in adversarial conversations.

Conversational AI
Customized Amazon Nova models improve molecular-property prediction in drug discovery

April 15, 2026

8 min read

Machine learning
How Amazon uses agentic AI for vulnerability detection at global scale

April 8, 2026

6 min read

Security, privacy, and abuse prevention
Verifying and optimizing post-quantum cryptography at Amazon

April 7, 2026

13 min read

Automated reasoning
Improving quality and robustness in LLM-based text-to-speech systems

April 1, 2026

5 min read

Conversational AI

View all

AWS and Hopkins Engineering announce database for AI/ML antibody design

The Antibody Developability Benchmark is powered by one of the most diverse antibody datasets in public literature, enabling transparent performance evaluation for AI-guided antibody design.

Amazon Research Awards issues Spring CFP

Now open across seven research areas, including Agentic AI and Robotics. Applicants receive unrestricted funds, AWS promotional credits, and training resources. Submission deadline is now May 13.

FINAL - making a mind Series Image (16x9).png

“Making a Mind” podcast explores science of intelligence

Hosted by Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features conversations with leading AI researchers about the breakthroughs needed to achieve general intelligence.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

Self-supervised multi-object tracking with path consistency

Zijia Lu, Bing Shuai, Yanbei Chen, Zhenlin Xu, Davide Modolo

CVPR 2024

2024

In this paper, we propose a novel concept of path consistency to learn robust object matching without using manual object identity supervision. Our key idea is that, to track a object through frames, we can obtain multiple different association results from a model by varying the frames it can observe, i.e., skipping frames in observation. As the differences in observations do not alter the identities of

Computer vision
Gradual fine-tuning with graph routing for multi-source unsupervised domain adaptation

Yao Ma, Samuel Louvan, Zhunxuan Wang

CoLLAs 2024

2024

Multi-source unsupervised domain adaptation aims to leverage labeled data from multiple source domains for training a machine learning model to generalize well on a target domain without labels. Source domain selection plays a crucial role in determining the model’s performance. It relies on the similarities amongst source and target domains. Nonetheless, existing work for source domain selection often

Machine learning
Automated evaluation of retrieval-augmented language models with task-specific exam generation

Gauthier Guinet, Behrooz Omidvar-Tehrani, Anoop Deoras, Laurent Callot

ICML 2024

2024

We propose a new method to measure the task-specific accuracy of Retrieval-Augmented Large Language Models (RAG). Evaluation is performed by scoring the RAG on an automatically-generated synthetic exam composed of multiple choice questions based on the corpus of documents associated with the task. Our method is an automated, cost-efficient, interpretable, and robust strategy to select the optimal components

Related: Automated evaluation of RAG pipelines with exam generation

Conversational AI
THRONE: An object-based hallucination benchmark for the free-form generations of large vision-language models

Prannay Kaul, Zhizhong Li, Hao Yang, Yonatan Dukler, Ashwin Swaminathan, C. J. Taylor, Stefano Soatto

CVPR 2024

2024

Mitigating hallucinations in large vision-language models (LVLMs) remains an open problem. Recent benchmarks do not address hallucinations in open-ended free-form responses, which we term “Type I hallucinations”. Instead, they focus on hallucinations responding to very specific question formats—typically a multiple-choice response regarding a particular object or attribute—which we term “Type II hallucinations

Computer vision
Confidence intervals for error rates in 1:1 matching tasks: Critical statistical analysis and recommendations

Riccardo Fogliato, Pratik Patil, Pietro Perona

International Journal of Computer Vision

2024

Matching algorithms predict relationships between items in a collection. For example, in 1:1 face verification, a matching algorithm predicts whether two face images depict the same per-son. Accurately assessing the uncertainty of the error rates of such algorithms can be challenging when test data are dependent and error rates are low, two aspects that have been often over-looked in the literature. In

Machine learning

Russ Tedrake (Massachusetts Institute of Technology).JPG

Gretchen Ertl

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and large language model coding security.

Credit: Wolfram Scheible

Research collaborations

We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.

Pai-Ling Yin, senior manager of research science, is seen speaking to a classroom, there is a chalkboard behind her and she is gesturing with her hands.

Courtesy of Pai-Ling Yin

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.

Customer-obsessed science

Research areas

From the blog

Featured news

Publications

Collaborations

Work with us