Search - Amazon Science

Bhargav Dodla

Applied Scientist

Andrei Ivanovic

Data Scientist

Kaan Kaya

Data Scientist

Aviv Melamud

Applied Scientist

SAGE: Semantic ambiguity gate

Nihir Chadderwala, Subrat Das, Chaitanya Vejendla, Tasio Guevara, Somdeb Bhattacharjee, Atul Chaudhari

arXiv

2026

Large language model (LLM) agents deployed in healthcare and life sciences (HCLS) routinely receive queries that are semantically ambiguous—the same terms carry different meanings across clinical, regulatory, pharmacovigilance, data-standards, and research domains. Existing approaches address ambiguity post-hoc through output filtering or retrieval augmentation, but do not quantify it before the model responds

Machine learning

Beyond disjoint tasks: Towards more natural continual learning for vision-language models

Xiang Xu, Yiyang Su, Tianchen Zhao, Zheng Zhang, Zhuowen Tu, Anil Jain, Jon Wu

ICML 2026

2026

Continual learning methods for vision-language models are developed on benchmarks where each new task introduces entirely new domain knowledge. Real-world task sequences are more natural: they routinely share visual concepts, language patterns, and even training samples across stages. However, existing mixture-of-expert methods that assign one expert per task with fixed routing can split similar inputs

Computer vision

The fuel of the future is already here: Why TRISO matters

Katy Huff

June 24, 2026

Millimeter-scale particles of nuclear-reactor fuel are encased in four layers of different materials that act as a “miniature containment system”.

Sustainability

Benchmarking multilingual temporal reasoning in LLMs: The temporal reasoning dataset

Vittorio Mazzia, Sandro Pollastrini, Davide Bernardi, Chiara Rubagotti, Daniele Amberti

IWSDS 2026

2026

Time reasoning is a make-or-break capability for Large Language Models (LLMs) aspiring to act as reliable personal and enterprise assistants. This paper introduces the Temporal Reasoning Dataset (TRD), a programmatically generated multilingual benchmark designed to evaluate temporal reasoning operational capabilities in LLMs across ten languages, with particular focus on basic operations relevant to conversational

Conversational AI

Karamvir Singh

Applied Scientist

Connor McMonigle

Applied Scientist

Kelleher Guerin

Principal, Applied Scientist

Kevin Haghi

Data Scientist

Ramkumar R

Manager, Data Science

Aurora PostgreSQL limitless database: Building a highly scalable OLTP database

Dmitry Arkhangelskiy, Saikiran Avula, Sachit Batra, Jin Cheng, Radwan Deeb, Alexey Gotsman, Upendra Gowda, Haritabh Gupta, Benoit Hudzia, Rishabh Jain, Kaumudi Kaushik, Aravind Kumar Kumar, Sergey Melnik, Saleem Mohideen, Sharique Muhammed, Davor Prugovecki, Sanjay Shanthakumar, Sagar Shedge, Anand Kumar, David Wein

SIGMOD/PODS 2026

2026

We present Aurora Limitless Database, a cloud-native distributed database system that extends Amazon Aurora PostgreSQL with horizontal scaling capabilities while maintaining strong consistency guarantees. The system provides transparent scalability using a router layer for query distribution and a storage layer of PostgreSQL shards, which eliminates the need for application-level sharding. Our key technical

Cloud and systems

Anchored FLoE: A business-guardrailed ensemble framework of foundation and local-trained models for demand forecasting

Haoxian Chen, Jiangwei Wang, Cindy Li, Merve Kayhan Serter, Sue Meng, Melody Zu, Abinaya Ulagappa, Vicky Yu, Michael Behrman

ECML-PKDD 2026

2026

Accurate demand forecasting is vital for retail supply chain efficiency, yet a persistent trust-capacity gap limits industrial production to low-capacity interpretable models that fail to capture complex market dynamics. We propose Anchored FLoE, a dual-model framework that bridges this gap by fusing high-capacity deep learning with rigorous business guardrails. The framework integrates: (1) FLoE, an ensemble

Economics

GRAFT: Grounding cold-start nodes via factorized structural alignment

Srinivas Virinchi, Aman Gulati, Gokul Swamy, Anoop S V K K Saladi

ECML-PKDD 2026

2026

Graph Neural Networks (GNNs) break down on zero-degree nodes, as message passing requires neighbors. Without interaction history, unseen entities are sub-optimally embedded, leaving them weakly anchored in the latent space, creating a cold-start bottleneck in retrieval. To address this, we propose GRAFT, a factorized architecture that unifies structural and feature transformations into a shared weight space

Machine learning

T2PO: Uncertainty-guided exploration control for stable multi-turn agentic reinforcement learning

Haixin Wang, Hejie Cui, Chenwei Zhang, Xin Liu, Shuowei Jin, Shijie Geng, Xinyang Zhang, Nasser Zalmout, Zhenyu Shi, Yizhou Sun

ICML 2026

2026

Recent progress in multi-turn reinforcement learning (RL) has significantly improved the performance of reasoning LLMs on complex interactive tasks. Despite advances in stabilization techniques such as fine-grained credit assignment and trajectory filtering, instability remains pervasive and often leads to training collapse. We argue that this instability stems from inefficient exploration in multi-turn settings

Conversational AI

USAD 2.0: Scaling representation distillation for universal audio understanding

Heng-Jui Chang, Alexander H. Liu, Saurabhchand Bhati, Mrudula Athi, Anton Ratnarajah, Amit Chhetri, James Glass

Interspeech 2026

2026

Audio encoders are critical to modern audio applications as large language models (LLMs) increasingly rely on a single encoder for diverse inputs. While self-supervised learning (SSL) has yielded strong domain-specific encoders like speech or music experts, multi-domain approaches like USAD and SPEAR remain limited in coverage and evaluation. Recent studies also suggest supervised encoders align better

Machine learning

Adaptive geometry routing for vision–language understanding

Sarthak Srivastava, Kathy Wu

KDD 2026

2026

Vision language models face a fundamental geometry trade-off: Euclidean representations excel at instance-level discrimination, while hyperbolic representations naturally encode semantic hierarchies. Hybrid training is challenging because one geometry may dominate early, leaving the other under-trained, a failure mode we term geometry dominance. We introduce Adaptive Geometry Routing (AGR), a framework that

Computer vision

Sairam Vaidya Mahadeva Ganapathy

Applied Scientist

