Customer-obsessed science
Research areas
-
December 5, 2025, 6 min read. A multiagent architecture separates data perception, tool knowledge, execution history, and code generation, enabling ML automation that works with messy, real-world inputs.
-
November 20, 2025, 4 min read
-
October 20, 2025, 4 min read
Featured news
-
NeurIPS 2025 Workshop on Efficient Reasoning, 2025. As Large Language Models (LLMs) continue to evolve, practitioners face increasing options for enhancing inference-time performance without model retraining, including budget tuning and multi-step techniques like self-reflection. While these methods improve output quality, they create complex trade-offs among accuracy, cost, and latency that remain poorly understood across different domains. This paper systematically
-
2025. Customer service often relies on human agents, which, while effective, can be costly and slower to scale. Recent advancements in intelligent chatbots, particularly Retrieval-Augmented Generation (RAG) models, have significantly enhanced efficiency by integrating large language models with external knowledge retrieval. However, developing a multi-turn RAG-based chatbot for real-world customer service presents
-
2025. Reasoning-enhanced large language models (RLLMs), whether explicitly trained for reasoning or prompted via chain-of-thought (CoT), have achieved state-of-the-art performance on many complex reasoning tasks. However, we uncover a surprising and previously overlooked phenomenon: explicit CoT reasoning can significantly degrade instruction-following accuracy. Evaluating 20+ models on two benchmarks: IFEval
-
NeurIPS 2025 Workshop on AI for Music, 2025. Recent advances in generative retrieval allow large language models (LLMs) to recommend items by generating their identifiers token by token, rather than using nearest-neighbor search over embeddings. This approach requires each item, such as a music track, to be represented by a compact and semantically meaningful token sequence that LLMs can generate. We propose a multimodal music tokenizer (3MToken)
-
NeurIPS 2025 Workshop on Multi-Turn Interactions in Large Language Models, 2025. Agentic tool use has gained traction with the rise of agentic tool calling, yet most existing work overlooks the complexity of multi-turn tool interactions. We introduce OrchDAG, a synthetic data generation pipeline that models tool execution as directed acyclic graphs (DAGs) with controllable complexity. Using this dataset, we benchmark model performance and propose a graph-based reward to enhance RLVR
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.