Customer-obsessed science
- January 13, 2026 (7 min read): Leveraging existing environment simulators and reward functions based on verifiable ground truth boosts task success rate, even with small models and small training datasets.
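The "verifiable ground truth" idea lends itself to a small illustration. The sketch below is an illustrative reading of that one-line summary, not code from the article; the function name, state representation, and goal format are all assumptions. It scores an episode by programmatically checking the simulator's final state against the task's goal conditions, giving a binary reward with no learned reward model in the loop.

```python
from typing import Dict

def verifiable_reward(final_state: Dict[str, object], goal: Dict[str, object]) -> float:
    """Binary reward computed by checking the simulator's final state
    against the task's goal conditions, rather than scoring the episode
    with a learned (and potentially noisy) reward model."""
    # Reward 1.0 only if every goal condition is satisfied exactly.
    return 1.0 if all(final_state.get(key) == value for key, value in goal.items()) else 0.0

# Hypothetical pick-and-place episode: success requires the object at the
# target location with the gripper released.
final_state = {"object_pos": "bin_A", "gripper": "open"}
goal = {"object_pos": "bin_A", "gripper": "open"}
print(verifiable_reward(final_state, goal))  # 1.0
```

Because the check is exact, every training example carries a clean success/failure signal, which is one plausible reason such rewards stretch small training datasets further.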
Featured news
- CIKM 2025: Relevance in e-commerce product search is critical to ensuring that results accurately reflect customer intent. While large language models (LLMs) have recently advanced natural language processing capabilities, their high inference latency and significant infrastructure demands make them less suitable for real-time e-commerce applications. Consequently, transformer-based encoder models are widely adopted…
- NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle: Rigorous evaluation of Large Language Models (LLMs) is critical for their adoption in high-stakes applications, particularly in highly technical domains that require deep expertise and specialized training. The proliferation of LLMs from various providers further underscores the need for comprehensive model performance benchmarking. Like many standardized tests and certification exams, several prominent…
- NeurIPS 2025 Workshop on Efficient Reasoning: We introduce PHLoRA (Post-hoc LoRA), a simple yet powerful method to extract low-rank adaptation adapters from full-rank fine-tuned models without requiring access to training data or gradients. By computing the low-rank decomposition of weight differences between a base model and its fine-tuned counterpart, our method reconstructs adapter modules that can be merged or dynamically routed at inference time… (a sketch of the extraction step follows this list)
- NeurIPS 2025 Workshop on ResponsibleFM: Given the constant flux in the world of geopolitics, staying up to date and compliant with international trade regulations is challenging. But whether LLMs can aid this task is a hitherto unexplored frontier in the LLM evaluation literature, primarily due to the lack of a dataset for benchmarking the capabilities of LLMs on questions regarding international trade. To address this gap, we…
- Transactions on Machine Learning Research, 2025: Despite fast progress, efficiently training large language models (LLMs) in extremely long contexts remains challenging. Existing methods fall back to training LLMs with short contexts (up to a few thousand tokens) and use inference-time techniques when evaluating on very long contexts (above 1M tokens). Training on very long contexts is limited by GPU memory availability and the prohibitively long training…
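To make the PHLoRA extraction step concrete, here is a minimal sketch of what "computing the low-rank decomposition of weight differences" can look like: factor the delta between a fine-tuned and a base weight matrix via truncated SVD into LoRA-style A/B factors. This is an illustrative reading of the abstract, not the paper's actual code; the function name, signature, and the choice to split singular values symmetrically between the factors are assumptions.

```python
import torch

def extract_lora(base_weight: torch.Tensor,
                 finetuned_weight: torch.Tensor,
                 rank: int):
    """Factor the weight delta into LoRA matrices (A, B) such that
    B @ A approximates finetuned_weight - base_weight at the given rank."""
    delta = finetuned_weight - base_weight             # full-rank update
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    # Keep the top-`rank` singular directions; absorb the singular
    # values symmetrically into both factors (one common convention).
    sqrt_s = S[:rank].sqrt()
    B = U[:, :rank] * sqrt_s                           # (out_features, rank)
    A = sqrt_s.unsqueeze(1) * Vh[:rank, :]             # (rank, in_features)
    return A, B

# Quick sanity check on a synthetic low-rank update: the rank-8
# reconstruction should match a true rank-8 delta almost exactly.
base = torch.randn(64, 32)
delta_true = torch.randn(64, 8) @ torch.randn(8, 32)   # rank-8 update
A, B = extract_lora(base, base + delta_true, rank=8)
print((B @ A - delta_true).norm() / delta_true.norm())  # tiny relative error
```

At inference, the adapter can then be applied as `base_weight + B @ A`, which recovers the fine-tuned behavior up to the rank truncation, matching the merge-or-route usage the abstract describes.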
Collaborations
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.