Customer-obsessed science
Research areas
-
June 8, 2026 | 7 min read | Four approaches can dramatically improve the performance and trustworthiness of AI agents in operational environments.
-
May 27, 2026 | 4 min read | Machine learning
Featured news
-
ICASSP 2026 | 2026 | Streaming automatic speech recognition (ASR) systems based on Large Language Models (LLMs) face a fundamental trade-off between accuracy and latency. Existing approaches typically employ fixed-size chunking to maintain low latency, which often compromises recognition accuracy. We propose SCALE, a streaming ASR framework that addresses this challenge through three key techniques: (a) dynamic chunk boundary
-
AISTATS 2026 | 2026 | A/B tests in online experiments face statistical power challenges when testing multiple candidates simultaneously, while adaptive experimental designs (AED) alone fall short in inferring experiment statistics such as the average treatment effect, especially with many metrics (e.g., revenue, safety) and heterogeneous variances. This paper proposes a fixed-budget multi-metric AED framework with a two-phase
-
2026 | Large Language Model (LLM)-based Multi-Agent Systems (MAS) enable complex problem-solving but introduce significant debugging challenges, characterized by long interaction traces, inter-agent dependencies, and delayed error manifestation. Existing diagnostic approaches often rely on expensive expert annotation or 'LLM-as-a-judge' paradigms, which struggle to pinpoint decisive error steps within extended
-
2026 | In this paper, we investigate how Multimodal Large Language Models can effectively master tool use to solve complex visual reasoning tasks. To that end, we propose a novel Tool-supervised Reinforcement Learning (ToolsRL) framework, with direct tool supervision for more effective tool-use learning. We focus on a series of simple, native, and interpretable visual tools, including zoom-in
-
CVPR 2026 Workshop on TRUE-V | 2026 | Vision Language Models (VLMs) are increasingly adopted for document understanding tasks, often replacing traditional OCR systems. However, VLMs exhibit a fundamental difference: they frequently correct or rewrite imperfect text rather than transcribe it literally, a behavior that remains largely underexplored. We present a systematic investigation through controlled experiments with intentionally perturbed
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.