Customer-obsessed science
Research areas
- March 20, 2026 (15 min read): Simplifying and clarifying the assembly code for core operations enabled automated optimization and verification.
Featured news
- 2026: Although LLMs have demonstrated improved performance by scaling parallel test-time compute, doing so relies on generating reasoning paths that are both diverse and accurate. For challenging problems, the forking tokens that trigger diverse yet correct reasoning modes are typically deep in the sampling tree. Consequently, common strategies to encourage diversity, such as temperature scaling, encounter a …
- 2026: While Large Language Models excel at mathematical reasoning with Chain-of-Thought prompting, their ability to perform systematic arithmetic reasoning without natural language scaffolding remains poorly understood. We investigate equation-only supervision, where LLMs map natural language problems directly to symbolic equation sequences without intermediate explanations. This approach separates reasoning …
- ICLR 2026 Workshop on AI for Mechanism Design and Strategic Decision Making, 2026: We investigate machine learning approaches for optimizing real-time staffing decisions in semi-automated warehouse sortation systems. Operational decision-making can be supported at different levels of abstraction, with different tradeoffs. We evaluate two approaches, each in a matching simulation environment. First, we train custom Transformer-based policies using offline reinforcement learning on detailed …
- ICLR 2026 Workshop on AI with Recursive Self-Improvement, 2026: Code generation with large language models often relies on multi-stage human-in-the-loop refinement, which is effective but very costly, particularly in domains such as frontend web development, where solution quality depends on rendered visual output. We present a fully automated critic-in-the-loop framework in which a vision-language model serves as a visual critic that provides structured feedback …
- ICLR 2026; NeurIPS 2025 Workshop on Foundations of Reasoning in Language Models, 2026: Process Reward Models (PRMs) have recently emerged as a powerful framework for enhancing the reasoning capabilities of large reasoning models (LRMs), particularly in the context of test-time scaling (TTS). However, their potential for supervising LRMs on tabular reasoning domains remains underexplored. Through detailed empirical analyses, we identify that existing PRMs, though widely adopted for supervising …
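The first featured abstract above cites temperature scaling as a common strategy for encouraging sampling diversity. As an illustrative aside (a minimal sketch, not code from any of the listed papers), temperature rescales a model's logits before the softmax, flattening or sharpening the resulting sampling distribution:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Apply softmax to logits scaled by 1/temperature.

    Higher temperature flattens the distribution (more diverse samples);
    lower temperature sharpens it toward the top token.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
# The top token's probability shrinks as temperature rises,
# which is why high temperature yields more diverse samples.
assert sharp[0] > softmax_with_temperature(logits, 1.0)[0] > flat[0]
```

As the abstract notes, for hard problems this kind of global flattening alone may not surface the deep "forking tokens" that lead to diverse yet correct reasoning paths.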
Collaborations
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.