Customer-obsessed science
- April 27, 2026 (4 min read): A new framework provides a statistical method for estimating the likelihood of catastrophic failures in large language models in adversarial conversations.
Featured news
- KDD 2025 Workshop on AI Agent for Information Retrieval: In this paper, we present CACHE-ED, a novel framework for document entity extraction that combines the power of large language models (LLMs) with graph-based document representations, caching mechanisms, and an actor-critic multi-agent architecture. Our approach addresses the inefficiencies and inaccuracies that are common in extracting structured information from documents, particularly in templated formats…
- Machine Learning for Healthcare 2025: Large language models demonstrate impressive performance on standardized healthcare benchmarks, yet their deployment readiness for real-world environments remains poorly understood. Current medical benchmarks present idealized scenarios that misrepresent the complexity of actual clinical data. We systematically evaluate LLM robustness by introducing clinician-validated perturbations to MedQA that mirror…
- Winter Simulation Conference 2025: The integration of Computer-Aided Design (CAD) models into discrete event simulation software is a critical requirement for many simulation projects, particularly those involving the movement of people or vehicles, where spatial accuracy directly impacts study outcomes. While importing CAD files and configuring simulation elements is essential for system accuracy, this process is typically time-consuming…
- Journal of the Royal Statistical Society, Series B (2025): Completely randomized experiments, originally developed by Fisher and Neyman in the 1930s, are still widely used in practice, even in online experimentation. However, such designs are of limited value for answering standard questions in marketplaces, where multiple populations of agents interact strategically, leading to complex patterns of spillover effects. In this paper, we derive the finite-sample properties…
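For context, a completely randomized design in the sense of Fisher and Neyman fixes the number of treated units in advance and draws the treated set uniformly at random. A minimal sketch of such an assignment (the function name and interface are illustrative, not from the paper):

```python
import random

def completely_randomized_assignment(n_units, n_treated, seed=None):
    """Assign exactly n_treated of n_units to treatment, uniformly at random.

    Classical completely randomized design: every subset of size
    n_treated is equally likely to be the treated group.
    """
    rng = random.Random(seed)
    treated = set(rng.sample(range(n_units), n_treated))
    # 1 = treated, 0 = control
    return [1 if i in treated else 0 for i in range(n_units)]
```

In a marketplace, the complication the paper points to is that units assigned this way still interact strategically, so a unit's outcome depends on others' assignments (spillovers), which the classical analysis ignores.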
- NeurIPS 2025 Workshop on Structured Probabilistic Inference & Generative Modeling: Large Language Models (LLMs) are increasingly deployed for structured data generation, yet output consistency remains critical for production applications. We introduce a comprehensive framework for evaluating and improving consistency in LLM-generated structured outputs. Our approach combines: (1) STED (Semantic Tree Edit Distance), a novel similarity metric balancing semantic flexibility with structural…
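The excerpt names STED (Semantic Tree Edit Distance) but its definition is cut off. As a rough illustration of the underlying idea only, an edit distance over JSON-like structured outputs, a minimal sketch might look like the following (the `tree_distance` and `size` helpers are hypothetical, not the paper's metric):

```python
def size(t):
    """Number of leaves in a JSON-like tree (cost of inserting or deleting it)."""
    if isinstance(t, dict):
        return sum(size(v) for v in t.values()) or 1
    if isinstance(t, list):
        return sum(size(v) for v in t) or 1
    return 1

def tree_distance(a, b):
    """Naive structural edit distance between two JSON-like trees.

    Dict nodes are compared key-by-key, lists position-by-position;
    a mismatched leaf costs 1, and a missing subtree costs its size.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        return sum(
            tree_distance(a[k], b[k]) if k in a and k in b
            else size(a[k] if k in a else b[k])
            for k in set(a) | set(b)
        )
    if isinstance(a, list) and isinstance(b, list):
        matched = sum(tree_distance(x, y) for x, y in zip(a, b))
        extra = sum(size(x) for x in a[len(b):] + b[len(a):])
        return matched + extra
    return 0 if a == b else 1
```

The paper's actual metric additionally weighs *semantic* equivalence of values (hence "semantic flexibility"), which this purely structural sketch does not attempt.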
Collaborations
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.