Customer-obsessed science
Research areas
- December 10, 2025 (5 min read): New audio-processing technology is making entertainment more accessible for millions of viewers.
- December 8, 2025 (8 min read)
- December 5, 2025 (6 min read)
Featured news
- EurIPS 2025 (2025). Current large language model (LLM) evaluations primarily focus on single-answer tasks, whereas many real-world applications require identifying multiple correct answers. This capability remains under-explored due to the lack of dedicated evaluation frameworks. We introduce SATA-BENCH, a benchmark for evaluating LLMs on Select All That Apply (SATA) questions spanning six domains, including reading comprehension
- ACML 2025 (2025). Continuous time-event sequence (CTES) forecasting is essential across diverse domains, from healthcare to finance, requiring accurate prediction of both future event types and their timestamps. Traditionally, CTES forecasting has been driven by Temporal Point Processes (TPPs), which rely on intensity function-based priors. However, these methods often fail to generalize effectively to real-world scenarios
- NeurIPS 2025 Workshop on Uncovering Causality in Science (2025). Switchback experiments assign units to treatment and control over time, yielding more precise causal estimates than fixed designs but risking bias from carryover effects, where past treatments influence future outcomes. Existing estimators require specifying an influence period, i.e., an upper bound on carryover duration, often guessed from intuition. We propose a statistical test that detects when this
- NeurIPS 2025 Workshop on Machine Learning and the Physical Sciences (2025). Long-horizon motion forecasting for multiple autonomous robots is challenging due to non-linear agent interactions, compounding prediction errors, and continuous-time evolution of dynamics. Learnt dynamics of such a system can be useful in various applications such as travel time prediction, prediction-guided planning and surrogate simulation. In this work, we aim to develop an efficient trajectory forecasting
- 2025. We present CEDA, a novel multimodal framework for detecting hallucinations in large language model outputs through a multi-agent debate approach. While existing methods for black-box LLMs often rely on response sampling and self-consistency checking, our framework leverages a three-fold approach: a multi-agent debate setting to critically examine and debate the authenticity of generated content, a lightweight
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.