Amazon Science homepage

Amazon scientists and policy experts discuss how the company’s responsible-AI pipeline embeds safety and values throughout the AI development lifecycle.

How mechanism design theory helps optimize Amazon-vendor collaboration

Agentic mechanism enables Amazon and vendors to optimize supply chain management without disclosing private information.

Preserving the privacy of AI training data

How we reproduced three attacks that extract private training data from AI models and the cryptographic defenses that stop them.

Navigating uncertainty in Amazon's middle-mile network

Amazon engineers and scientists have created new tools to optimize delivery networks under uncertainty — and keep them adapting without missing a beat.

The proof assistant behind the Nitro Isolation Engine

Isabelle/HOL's balance of expressiveness, automation, and scalability enabled the world's first formally verified cloud hypervisor.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

View all

Technical deep-dives and perspectives from our scientists.

View all

Making LLMs faster without sacrificing accuracy

May 15, 2026

5 min read

A new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.

Conversational AI
Promptimus: Improving already good LLM prompts with zero manual engineering

May 14, 2026

16 min read
How catastrophic is your LLM?

April 27, 2026

4 min read

Conversational AI
Customized Amazon Nova models improve molecular-property prediction in drug discovery

April 15, 2026

8 min read

Machine learning
How Amazon uses agentic AI for vulnerability detection at global scale

April 8, 2026

6 min read

Security, privacy, and abuse prevention

View all

AWS and Hopkins Engineering announce database for AI/ML antibody design

The Antibody Developability Benchmark is powered by one of the most diverse antibody datasets in public literature, enabling transparent performance evaluation for AI-guided antibody design.

Amazon Research Awards issues Spring CFP

Now open across seven research areas, including Agentic AI and Robotics. Applicants receive unrestricted funds, AWS promotional credits, and training resources. Submission deadline is now May 13.

FINAL - making a mind Series Image (16x9).png

“Making a Mind” podcast explores science of intelligence

Hosted by Danielle Perszyk, cognitive scientist at Amazon's AGI Lab, the podcast features conversations with leading AI researchers about the breakthroughs needed to achieve general intelligence.

2026 Amazon Nova AI Challenge: Trusted Software Agents track

Challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.

Amazon launches $68 million AI PhD Fellowship program

Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.

Importance of synthesizing high-quality data for text-to-SQL parsing

Yiqun Hu, Yiyun Zhao, Jiarong Jiang, Wuwei Lan, Henry Zhu, Anuj Chauhan, Alexander Li, Lin Pan, Jun Wang, Chung-Wei Hang, Sheng Zhang, Jiang Guo, Marvin Dong, Joe Lilien, Patrick Ng, Zhiguo Wang, Vittorio Castelli, Bing Xiang

ACL Findings 2023, NeurIPS 2022 Workshop on SyntheticData4ML

2023

There has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data. We observed two shortcomings: illogical synthetic SQL queries from independent column sampling and

Conversational AI
Dual cache for long document neural coreference resolution

Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

ACL 2023

2023

Recent works show the effectiveness of cache-based neural coreference resolution models on long documents. These models incrementally process a long document from left to right and extract relations between mentions and entities in a cache, resulting in much lower memory and computation cost compared to computing all mentions in parallel. However, they do not handle cache misses when high-quality entities

Conversational AI
Graph reasoning for question answering with triplet retrieval

Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang, Bing Yin

ACL Findings 2023

2023

Answering complex questions often requires reasoning over knowledge graphs (KGs). State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into KG encoder, e.g. graph neural networks (GNNs), to model their local structures and integrated into language models for question answering. However, this paradigm constrains retrieved knowledge in local subgraphs

Conversational AI
LightToken: A task and model-agnostic lightweight token embedding framework for pre-trained language models

Haoyu Wang, Ruirui Li, Haoming Jiang, Zhengyang Wang, Xianfeng Tang, Bin Bi, Monica Cheng, Bing Yin, Yaqing Wang, Tuo Zhao, Jing Gao

KDD 2023

2023

Pre-trained language models (PLMs) such as BERT, RoBERTa, and DeBERTa have achieved state-of-the-art performance on various downstream tasks. The enormous sizes of PLMs hinder their deployment in resource-constrained scenarios, e.g., on edge and mobile devices. To address this issue, many model compression approaches have been proposed to reduce the number of model parameters. This paper focuses on compressing

Related: Compressing token-embedding matrices for language models

Information and knowledge management
A metric-driven approach to conformer layer pruning for efficient ASR inference

Dhanush Bekal, Karthik Gopalakrishnan, Karel Mundnich, Srikanth Ronanki, Sravan Bodapati, Katrin Kirchhoff

Interspeech 2023

2023

Conformer-based end-to-end automatic speech recognition (ASR) models have gained popularity in recent years due to their exceptional performance at scale. However, there are significant computation, memory and latency costs associated with running inference on such models. With the aim of mitigating these issues, we evaluate the efficacy of pruning Conformer layers while fine-tuning only on 20% of the data

Conversational AI

Russ Tedrake (Massachusetts Institute of Technology).JPG

Gretchen Ertl

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Nova AI Challenge

A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and large language model coding security.

Credit: Wolfram Scheible

Research collaborations

We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.

Pai-Ling Yin, senior manager of research science, is seen speaking to a classroom, there is a chalkboard behind her and she is gesturing with her hands.

Courtesy of Pai-Ling Yin

Amazon Scholars

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.

Customer-obsessed science

Research areas

From the blog

Featured news

Publications

Collaborations

Work with us