Customer-obsessed science
Research areas
-
March 20, 2026. 15 min read. Simplifying and clarifying the assembly code for core operations enabled automated optimization and verification.
-
March 19, 2026. 11 min read.
-
February 25, 2026. 11 min read.
-
February 17, 2026. 3 min read.
Featured news
-
ICLR 2026, NeurIPS 2025 Workshop on Foundations of Reasoning in Language Models. 2026. Process Reward Models (PRMs) have recently emerged as a powerful framework for enhancing the reasoning capabilities of large reasoning models (LRMs), particularly in the context of test-time scaling (TTS). However, their potential for supervising LRMs on tabular reasoning domains remains underexplored. Through detailed empirical analyses, we identify that existing PRMs, though widely adopted for supervising
-
AAMAS 2026. 2026. Evaluating news recommendation systems (NRS) presents unique challenges due to their dynamic and interactive nature coupled with evolving user interests. In the early stages of development, when user bases and historical data are scarce, it is difficult to conduct meaningful offline and online evaluations. This cold-start evaluation challenge hinders data-driven decision-making for product development and
-
2026. Flow-based Generative Models (FGMs) effectively transform noise into complex data distributions. Incorporating Optimal Transport (OT) to couple noise and data during FGM training has been shown to improve the straightness of flow trajectories, enabling more effective inference. However, existing OT-based methods estimate the OT plan using (mini-)batches of sampled noise and data points, which limits their
-
WSDM 2026. 2026. Music streaming fraud, where bad actors artificially inflate stream counts to manipulate chart rankings and royalty payments, poses a significant threat to streaming services and legitimate content creators. Traditional fraud detection approaches struggle with a critical challenge: many legitimate edge cases, including super-fans and sleep-music sessions, exhibit activity patterns that closely mimic those
-
2026. Despite multilingual pretraining, large language models often struggle with non-English tasks, particularly in language control, the ability to respond in the intended language. We identify and characterize two key failure modes: the multilingual transfer bottleneck (correct language, incorrect task response) and the language consistency bottleneck (correct task response, wrong language). To systematically
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.