Customer-obsessed science
Research areas
-
June 8, 2026 · 7 min read
Four approaches can dramatically improve the performance and trustworthiness of AI agents in operational environments.
May 26, 2026 · 5 min read
Featured news
-
Transactions on Machine Learning Research, 2026
Large Reasoning Models (LRMs) excel at complex reasoning tasks, but their efficiency is often hampered by overly verbose outputs. Prior steering methods attempt to address this issue by applying a single, global vector to hidden representations, an approach grounded in the restrictive linear representation hypothesis. In this work, we introduce FlowSteer, a nonlinear steering method that goes beyond uniform
-
ICML 2026 Workshop on Foundation Models for Structured Data, 2026
Tabular and relational foundation models have demonstrated strong in-context learning on academic benchmarks, but their behavior on enterprise-scale structured data (marked by multi-relational schemas, extreme sparsity, and cold-start inference requirements) remains understudied. We evaluate two foundation model paradigms on global supply chain compliance risk prediction, a setting that stresses all three
-
ICML 2026 Workshop on Resource-Adaptive Foundation Model Inference (AdaptFM), 2026
We establish a formal equivalence between the Quantized Johnson–Lindenstrauss (QJL) transform of the TurboQuant KV cache compression scheme and the classical 1-bit compressive sensing (1-bit CS) model of Boufounos and Baraniuk (2008), which lets us import 1-bit CS theory into QJL analysis. From it we derive three new consequences. First, reconstruction guarantees for QJL side-channel estimates in terms
-
Interspeech 2026, 2026
Turn-taking in multi-party spoken conversations remains a fundamental challenge for voice-based agents, particularly under dynamic floor competition and varying user expectations. We propose ModeratorLM, a role-playing voice agent that conditions turn-taking behavior on an explicitly assigned role in multi-party settings. The system is built on a speech large language model operating in chunk-wise streaming
-
2026
Fine-tuning large language models (LLMs) for downstream tasks typically exhibits a fundamental safety-capability trade-off, where improving task performance degrades safety alignment even on benign datasets. This degradation persists across standard approaches including supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). While reinforcement learning with verifiable rewards
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.