Customer-obsessed science
Research areas
-
January 13, 2026, 7 min read. Leveraging existing environment simulators and reward functions based on verifiable ground truth boosts task success rate, even with small models and small training datasets.
Featured news
-
NeurIPS 2025 Workshop on Multi-Turn Interactions in Large Language Models, 2025. In real-world task-oriented dialogue (TOD) settings such as customer support for trip booking, banking, and healthcare, agents are required to strictly adhere to complex instructions while conducting multi-turn conversations with customers. These instructions are typically presented in natural language format and include general guidelines and step-by-step procedures with complex constraints. Existing TOD…
-
ARR 2025, 2025. Language models have demonstrated remarkable capabilities in reasoning tasks through test-time scaling techniques like best-of-N sampling and tree search. However, these approaches often demand substantial computational resources, creating a critical trade-off between performance and efficiency. We introduce STAND (STochastic Adaptive N-gram Drafting), a novel model-free speculative decoding approach that…
-
2025. Task-Oriented Dialogue (TOD) systems have become increasingly important for real-world applications, yet existing frameworks face significant challenges in handling unstructured information, providing multilingual support, and engaging proactively. We propose SMART (Scalable Multilingual Approach for a Robust TOD System), a novel TOD framework that effectively addresses these limitations. SMART combines…
-
CIKM 2025, 2025. Search query understanding (QU) is an important building block of modern e-commerce search engines. QU extracts multiple intents from customer queries, including intended color, brand, etc. One of the most important tasks in QU is predicting which product category the user is interested in. In our work, we address the query product type classification (Q2PT) task. Compared to classification of full-fledged…
-
2025. Data perspectivism goes beyond majority vote label aggregation by recognizing various perspectives as legitimate ground truths. However, current evaluation practices remain fragmented, making it difficult to compare perspectivist approaches and analyze their impact on different users and demographic subgroups. To address this gap, we introduce PersEval, the first unified framework for evaluating perspectivist…
Collaborations
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.