Customer-obsessed science
Research areas
-
May 15, 2026 (5 min read): A new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.
-
May 14, 2026 (16 min read)
-
April 15, 2026 (8 min read)
Featured news
-
ICASSP 2026, 2026: Recent advances in generative retrieval allow large language models (LLMs) to recommend items by generating their identifiers token by token. This requires each item to be represented by a compact, semantically meaningful sequence of tokens that an LLM can understand. We introduce a method to generate multimodal music token (3MToken) that transforms rich metadata from a music database—including audio, credits
-
2026: Reinforcement learning with verifiable rewards has significantly advanced reasoning with large language models (LLMs) in domains such as mathematics and logic. However, verifiable signals provide only coarse-grained or binary correctness feedback. This limitation results in inefficiencies like overly verbose or repetitive reasoning. Existing length-based solutions (e.g., length penalty) compromise accuracy
-
2026: Large Language Models (LLMs) can serve as world models to enhance agent decision-making in digital environments by simulating future states and predicting action outcomes, potentially eliminating costly trial-and-error exploration. However, this capability is fundamentally limited by LLMs' tendency to hallucinate and their reliance on static training knowledge, which could lead to compounding errors that
-
2026: Predictive modeling over relational databases (RDBs) powers applications in various domains, yet remains challenging due to the need to capture both cross-table dependencies and complex feature interactions. Recent Relational Deep Learning (RDL) methods automate feature engineering via message passing, while classical approaches like Deep Feature Synthesis (DFS) rely on predefined non-parametric aggregators
-
BIG.AI@MIT, 2026: Large language models (LLMs) are increasingly deployed in real-world applications such as chatbots, writing assistants, and text summarization tools. As these applications become more central to user-facing tasks, robust evaluation of their performance becomes critical, not only for ensuring quality but also for guiding continuous improvement. Traditional evaluation approaches rely on intrinsic metrics
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.