Amazon Science homepage

How agentic AI helps heal the systems we can’t replace

By learning the idiosyncrasies of accumulated layers of legacy systems, AI agents can preserve institutional knowledge and provide a unified interface to a range of services.

Designing AI agents that know when to step back

As AI agents become more autonomous, the key challenge isn't what they can do; it's how to design the human side of the equation.

How AI is changing the nature of mathematical research

What machine learning theorists learned using AI agents to generate proofs — and what comes next.

Intelligence isn’t about parameter count. It’s about time.

As AI models grow larger, they become less insightful, not more. To ensure that they continue to learn, we need to reduce their inference time.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

Technical deep-dives and perspectives from our scientists.

Formally verified AES-XTS: The first AES algorithm to join s2n-bignum

March 20, 2026

15 min read

Simplifying and clarifying the assembly code for core operations enabled automated optimization and verification.

Automated reasoning

Optimizing LoRA target module selection for efficient fine tuning

March 19, 2026

11 min read

Machine learning

Why a 12-year-old forecasting paper has stood the test of time

February 17, 2026

3 min read

Machine learning

A decade of NFL Next Gen Stats innovation

February 2, 2026

10 min read

Machine learning

Customizing multiturn AI agents with reinforcement learning

January 13, 2026

7 min read

Conversational AI

Amazon Research Awards issues Spring CFP

Now open across seven research areas, including Agentic AI and Robotics. Applicants receive unrestricted funds, AWS promotional credits, and training resources. The submission deadline is May 6.

“Making a Mind” podcast explores science of intelligence

Hosted by Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features conversations with leading AI researchers about the breakthroughs needed to achieve general intelligence.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

Quantifying fairness in LLMs beyond tokens: A semantic and statistical perspective

Weijie Xu, Yiwen Wang, Chi Xue, Xiangkun Hu, Xi Fang, Guimin Dong, Chandan Reddy

COLM 2025

2025

Large Language Models (LLMs) often generate responses with inherent biases, undermining their reliability in real-world applications. Existing evaluation methods often overlook biases in long-form responses and the intrinsic variability of LLM outputs. To address these challenges, we propose FiSCo (Fine-grained Semantic Comparison), a novel statistical framework to evaluate group-level fairness in LLMs

Related: Making fairness in LLMs observable, quantifiable, and governable

Machine learning

FalseReject: A resource for improving contextual safety and mitigating over-refusals in LLMs via structured reasoning

Zhehao Zhang, Weijie Xu, Fanyou Wu, Chandan Reddy

COLM 2025

2025

Safety alignment approaches in large language models (LLMs) often lead to the over-refusal of benign queries, significantly diminishing their utility in sensitive scenarios. To address this challenge, we introduce FalseReject, a comprehensive resource containing 16k seemingly toxic queries accompanied by structured responses across 44 safety-related categories. We propose a graph-informed adversarial multi-agent

Related: FalseReject: Reducing overcautiousness in LLMs through reasoning-aware safety evaluation

Conversational AI

Document haystack: A long context multimodal image/document understanding vision LLM benchmark

Goeric Huybrechts, Srikanth Ronanki, Sai Muralidhar Jayanthi, Jack G. M. FitzGerald, Srinivasan Veeravanallur

ICCV 2025

2025

The proliferation of multimodal Large Language Models has significantly advanced the ability to analyze and understand complex data inputs from different modalities. However, the processing of long documents remains under-explored, largely due to a lack of suitable benchmarks. To address this, we introduce Document Haystack, a comprehensive benchmark designed to evaluate the performance of Vision Language

Machine learning

GT2Vec: Large language models for knowledge graph augmented text embedding

Jiacheng Lin, Kun Qian, Haoyu Han, Nurendra Choudhary, Tianxin Wei, Zhongruo Wang, Sahika Genc, Edward W Huang, Sheng Wang, Karthik Subbian, Danai Koutra

KDD 2025

2025

Graph-structured information offers rich contextual information that can enhance language models by providing structured relationships and hierarchies, leading to more expressive embeddings for various applications such as retrieval, question answering, and classification. However, existing methods for integrating graph and text embeddings, often based on Multi-layer Perceptrons (MLPs) or shallow transformers

Search and information retrieval

Using large language models to improve product information in e-commerce catalogs

Gang Luo, Julien Han, Hayreddin Ceker, Karim Bouyarmane

CIKM 2025

2025

To give customers a good experience, an e-commerce retailer needs high-quality product information in its catalog. Yet the raw product information often lacks sufficient quality. For a large catalog that can contain billions of products, manually fixing this information is highly labor-intensive. To address this issue, we propose using the tool use functionality of large language models to automatically

Conversational AI

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and large language model coding security.

Research collaborations

We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.
