Amazon Science homepage

How agentic AI helps heal the systems we can’t replace

By learning the idiosyncrasies of accumulated layers of legacy systems, AI agents can preserve institutional knowledge and provide a unified interface to a range of services.

Designing AI agents that know when to step back

As AI agents become more autonomous, the key challenge isn't what they can do; it's how to design the human side of the equation.

How AI is changing the nature of mathematical research

What machine learning theorists learned using AI agents to generate proofs — and what comes next.

Intelligence isn’t about parameter count. It’s about time.

As AI models grow larger, they become less insightful, not more. To ensure that they continue to learn, we need to reduce their inference time.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

Technical deep-dives and perspectives from our scientists.

Formally verified AES-XTS: The first AES algorithm to join s2n-bignum

March 20, 2026

15 min read

Simplifying and clarifying the assembly code for core operations enabled automated optimization and verification.

Automated reasoning

Optimizing LoRA target module selection for efficient fine tuning

March 19, 2026

11 min read

Machine learning

Why a 12-year-old forecasting paper has stood the test of time

February 17, 2026

3 min read

Machine learning

A decade of NFL Next Gen Stats innovation

February 2, 2026

10 min read

Machine learning

Customizing multiturn AI agents with reinforcement learning

January 13, 2026

7 min read

Conversational AI

Amazon Research Awards issues Spring CFP

Now open across seven research areas, including Agentic AI and Robotics. Applicants receive unrestricted funds, AWS promotional credits, and training resources. The submission deadline is May 6.

“Making a Mind” podcast explores science of intelligence

Hosted by Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features conversations with leading AI researchers about the breakthroughs needed to achieve general intelligence.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

TN-Eval: Rubric and evaluation protocols for measuring the quality of behavioral therapy notes

Raj Shah, Lei Xu, Flora Liu, Jon Burnsky, Drew Bertagnolli, Chaitanya Shivade

ACL 2025

2025

Behavioral therapy notes are important for both legal compliance and patient care. Unlike progress notes in physical health, quality standards for behavioral therapy notes remain underdeveloped. To address this gap, we collaborated with licensed therapists to design a comprehensive rubric for evaluating therapy notes across key dimensions: completeness, conciseness, and faithfulness. Further, we extend

Conversational AI

Kaputt: A large-scale dataset for visual defect detection

Sebastian Hoefer, Dorian Henning, Artemij Amiranashvili, Doug Morrison, Mariliza Tzes, Ingmar Posner, Marc Matvienko, Alessandro Rennola, Anton Milan

ICCV 2025

2025

We present a novel large-scale dataset for defect detection in a logistics setting. Recent work on industrial anomaly detection has primarily focused on manufacturing scenarios with highly controlled poses and a limited number of object categories. Existing benchmarks like MVTec-AD [6] and VisA [33] have reached saturation, with state-of-the-art methods achieving up to 99.9% AUROC scores. In contrast to

Related: Novel “Kaputt” dataset sets new benchmark for large-scale visual defect detection

Computer vision

DFLOW: Diverse dialogue flow simulation with large language models

Wanyu Du, Song Feng, James Gung, Justin Sun, Yi Zhang, Saab Mansour, Yanjun (Jane) Qi

ACL 2025 Workshop on Research on Agent Language Models

2025

Developing language model-based dialogue agents requires effective data to train models that can follow specific task logic. However, most existing data simulation methods focus on increasing diversity in language, topics, or dialogue acts at the utterance level, largely neglecting a critical aspect of task logic diversity at the dialogue level. This paper proposes a novel data simulation method designed

Conversational AI

Human-aligned long-form evaluation (HALF-Eval): Framework for assessing AI-generated content and improvement

Sulbha Jain

KDD 2025 Workshop on LLM4ECommerce

2025

Evaluating long-form AI-generated content remains challenging due to the lack of standardized methodologies that robustly align with human judgment across formats such as articles, blogs, and essays. We introduce HALF-Eval, a scalable framework that combines structured, checklist-based evaluation with machine learning aggregation to assess key quality dimensions, including creativity, impact, coherence

Conversational AI

JokeEval: Are the jokes funny? Review of computational evaluation techniques to improve joke generation

Sulbha Jain

KDD 2025 Workshop on Evaluation and Trustworthiness of Agentic and Generative AI Models

2025

Humor is a complex yet essential aspect of human communication. It can be defined as a communicative expression establishing surprising, incongruent relationships or meanings to amuse. This paper presents empirical evidence demonstrating the successful application of computational methods to humor recognition in AI generated textual data, specifically jokes. Through experiments on synthetic and open-source

Conversational AI

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and large language model coding security.

Research collaborations

We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.
