Customer-obsessed science
Research areas
-
January 13, 2026 | 7 min read
Leveraging existing environment simulators and reward functions based on verifiable ground truth boosts task success rate, even with small models and small training datasets. (A minimal sketch of the idea follows.)
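The news item above only names the ingredients, so here is a minimal Python sketch, under stated assumptions, of what a reward function based on verifiable ground truth can look like: the environment simulator (or task specification) supplies a reference answer, and the reward is a simple programmatic check rather than a learned reward model. The normalization rule and the binary exact-match reward are illustrative choices, not the setup described in the article.

```python
# A minimal sketch (assumptions only, not the article's actual setup): a reward
# computed by checking the agent's final answer against verifiable ground truth
# supplied by an existing environment simulator or task specification.
def verifiable_reward(predicted_answer: str, ground_truth: str) -> float:
    """Binary reward from exact-match verification after light normalization."""
    normalize = lambda s: " ".join(s.strip().lower().split())
    return 1.0 if normalize(predicted_answer) == normalize(ground_truth) else 0.0

# The ground truth comes from the simulator, so rollouts can be scored at scale
# without human labels, even when the policy model and dataset are small.
print(verifiable_reward("  Paris ", "paris"))  # 1.0
print(verifiable_reward("London", "paris"))    # 0.0
```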
Featured news
-
2025
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets. However, the large size and high computation demands of LLMs limit their practicality in many applications, especially when further fine-tuning is required. To address these limitations, smaller models are typically preferred for deployment. However, their training is ...
-
NeurIPS 2025 Workshop on Structured Probabilistic Inference & Generative Modeling, 2025
Large Language Models (LLMs) are increasingly deployed for structured data generation, yet output consistency remains critical for production applications. We introduce a comprehensive framework for evaluating and improving consistency in LLM-generated structured outputs. Our approach combines: (1) STED (Semantic Tree Edit Distance), a novel similarity metric balancing semantic flexibility with structural ... (A toy tree-distance sketch follows this list.)
-
NeurIPS 2025 Workshop on Foundations of Reasoning in Language Models, 2025
Test-time scaling has emerged as a promising paradigm to enhance reasoning in large reasoning models by allocating additional inference-time compute. However, its potential for tabular reasoning remains underexplored. We identify that existing process reward models, widely used to supervise reasoning steps, struggle with table-specific operations such as table retrieval and schema interaction, leading to ... (A best-of-n scoring sketch follows this list.)
-
2025
The efficient implementation of large language models (LLMs) is crucial for deployment on resource-constrained devices. Low-rank tensor compression techniques, such as tensor-train (TT) networks, have been widely studied for over-parameterized neural networks. However, their application to compressing pre-trained LLMs for downstream tasks (post-training) remains challenging due to ... (A TT-SVD sketch follows this list.)
-
ACM SIGOPS 2025 Workshop on Hot Topics in Operating Systems, 2025
A metastable failure is a self-sustaining congestive collapse in which a system degrades in response to a transient stressor (e.g., a load surge) but fails to recover after the stressor is removed. These rare but potentially catastrophic events are notoriously hard to diagnose and mitigate, sometimes causing prolonged outages affecting millions of users. Ideally, we would discover susceptibility to metastable ... (A toy retry-collapse simulation follows this list.)
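For the structured-output consistency item above, here is a toy Python sketch of the general idea behind a tree-edit-style comparison of two LLM-generated JSON objects: structural differences (missing keys, extra list items) count as edits, while leaf values are compared with a little semantic slack (case and whitespace). This is a deliberately simplified stand-in, not the paper's STED metric.

```python
# Toy tree-edit-style distance over JSON-like structures. NOT the STED metric from
# the paper; it only illustrates "structural edits + semantically flexible leaves".
def leaves_match(a, b):
    # Semantic slack at the leaves: ignore case and surrounding whitespace for strings.
    if isinstance(a, str) and isinstance(b, str):
        return a.strip().lower() == b.strip().lower()
    return a == b

def tree_size(t):
    if isinstance(t, dict):
        return 1 + sum(tree_size(v) for v in t.values())
    if isinstance(t, list):
        return 1 + sum(tree_size(v) for v in t)
    return 1

def tree_distance(a, b):
    """Count insert/delete/relabel edits needed to turn structure a into structure b."""
    if isinstance(a, dict) and isinstance(b, dict):
        dist = 0
        for k in set(a) | set(b):
            if k not in a:
                dist += tree_size(b[k])          # key missing from a: insert whole subtree
            elif k not in b:
                dist += tree_size(a[k])          # key missing from b: delete whole subtree
            else:
                dist += tree_distance(a[k], b[k])
        return dist
    if isinstance(a, list) and isinstance(b, list):
        dist = sum(tree_distance(x, y) for x, y in zip(a, b))
        extra = a[len(b):] if len(a) > len(b) else b[len(a):]
        return dist + sum(tree_size(x) for x in extra)
    return 0 if leaves_match(a, b) else 1        # leaf relabel (or type mismatch) costs 1

print(tree_distance({"name": "Ada ", "tags": ["ml", "nlp"]},
                    {"name": "ada", "tags": ["ml"]}))   # 1: only the extra "nlp" tag
```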
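For the tabular test-time scaling item above, here is a hedged sketch of the best-of-n pattern such work builds on: several candidate reasoning chains are sampled, a process reward model scores each intermediate step, and the chain with the best aggregate step score is kept. The keyword-based scorer and the min aggregation below are toy assumptions standing in for a trained PRM, not the paper's method.

```python
# Hedged sketch of test-time scaling via best-of-n selection with step-level scores.
# toy_scorer is a stand-in for a trained process reward model (PRM); the paper's
# point is that generic PRMs handle table-specific steps (retrieval, schema use) poorly.
def score_chain(steps, step_scorer):
    """Aggregate per-step scores; taking the minimum penalizes any single weak step."""
    return min(step_scorer(s) for s in steps)

def best_of_n(candidate_chains, step_scorer):
    """Keep the reasoning chain whose weakest step is strongest."""
    return max(candidate_chains, key=lambda chain: score_chain(chain, step_scorer))

def toy_scorer(step):
    # Invented heuristic: reward steps that explicitly touch table structure.
    return 0.9 if any(w in step for w in ("row", "column", "value")) else 0.4

chains = [
    ["look at the table", "guess the answer"],
    ["retrieve the relevant row", "read the target column", "report the value"],
]
print(best_of_n(chains, toy_scorer))  # picks the second, more table-grounded chain
```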
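For the tensor-train item above, here is a small NumPy sketch of the classic TT-SVD procedure that TT compression rests on: a weight matrix is reshaped into a higher-order tensor and factored into a chain of 3-way cores by sequential truncated SVDs. The reshape shape, the fixed maximum rank, and the random example matrix are illustrative assumptions, not the paper's post-training method.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Factor a d-way tensor into tensor-train cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, S.size)                       # truncate to the TT rank
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        mat = (S[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a dense tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=1)
    return out[0, ..., 0]

# Example: compress a 256x256 weight matrix by treating it as a 16x16x16x16 tensor.
W = np.random.randn(256, 256)
cores = tt_svd(W.reshape(16, 16, 16, 16), max_rank=8)
W_hat = tt_reconstruct(cores).reshape(256, 256)
print(sum(c.size for c in cores), "TT parameters vs", W.size, "dense parameters")
print("relative reconstruction error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```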
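For the metastable-failure item above, here is a toy discrete-time simulation of the retry-amplification mechanism that such collapses are commonly attributed to: a service running comfortably below capacity is hit by a short load surge, and once the retry backlog passes a tipping point the system stays saturated even after the surge ends. Every constant below (capacity, surge size, amplification factor, backlog cap) is invented purely for illustration.

```python
# Toy model of a self-sustaining congestive collapse (metastable failure).
# Not the paper's model; all parameters are illustrative assumptions.
def simulate(steps=60, capacity=100, base_load=80, surge_extra=120,
             surge_window=(10, 15), retry_amplification=1.5, max_backlog=10_000):
    backlog, history = 0.0, []
    for t in range(steps):
        arrivals = base_load + (surge_extra if surge_window[0] <= t < surge_window[1] else 0)
        # Offered load = fresh requests plus retries of previously timed-out work.
        # An amplification factor > 1 models duplicated retries and wasted timeout work.
        offered = arrivals + retry_amplification * backlog
        served = min(offered, capacity)
        # Unserved work retries later; the cap crudely models finite client concurrency.
        backlog = min(offered - served, max_backlog)
        history.append((t, round(offered), round(backlog)))
    return history

# Before t=10 the system is healthy (offered 80 < capacity 100). The surge ends at
# t=15, yet the backlog never drains: the stressor is gone, the collapse persists.
for t, offered, backlog in simulate()[::10]:
    print(f"t={t:2d}  offered={offered:6d}  backlog={backlog:6d}")
```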
Collaborations
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.
View all