Customer-obsessed science
Research areas
-
November 20, 2025, 4 min read. A new evaluation pipeline called FiSCo uncovers hidden biases and offers an assessment framework that evolves alongside language models.
-
September 2, 2025, 3 min read
Featured news
-
NeurIPS 2025 Workshop on Recent Advances in Time Series Foundation Models, 2025. Many time series applications require access to multi-step forecast trajectories in the form of sample paths. Recently, time series foundation models have leveraged multi-step lookahead predictions to improve the quality and efficiency of multi-step forecasts. However, these models only predict independent marginal distributions for each time step, rather than a full joint predictive distribution. To generate
-
2025. Music recommendation systems face the dual challenge of capturing both immediate context and long-term preferences in users' listening patterns. We adapt a generalized sequential model architecture for music recommendation, introducing modifications that acknowledge how music preferences combine temporal patterns and stable tastes. By removing causal masking constraints typically used in sequential models
-
2025. This paper introduces a three-stage multi-agent LLM framework designed to transform unstructured and ambiguous Standard Operating Procedures (SOPs) into a structured plan and an executable code template. Unstructured SOPs, common across industries such as finance, retail, and logistics, frequently suffer from ambiguity, missing information, and inconsistency, all of which hinder automation. We address this
-
Code@MIT 2025, 2025. In A/B testing, statistical power depends on both the variance of estimated impacts and the distribution of true impacts. A low-variance metric can have low power if true impacts on the metric tend to be small, while a high-variance metric can have high power if true impacts on the metric tend to be large. Traditional power calculations, however, focus solely on the variance of estimated impacts. They compute
-
NeurIPS 2025 Workshop on Bridging Language, Agent, and World Models (LAW), 2025. We present a framework for uncovering and exploiting dependencies among tools and documents to enhance exemplar artifact generation. Our method begins by constructing a tool knowledge graph from tool schemas (including descriptions, arguments, and output payloads) using a DeepResearch-inspired analysis. In parallel, we derive a complementary knowledge graph from internal documents and SOPs, which is then
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.