Customer-obsessed science
Research areas
- March 20, 2026, 15 min read. Simplifying and clarifying the assembly code for core operations enabled automated optimization and verification.
- March 19, 2026, 11 min read
- February 17, 2026, 3 min read
- January 13, 2026, 7 min read
Featured news
- 2025: Retrieval Augmented Generation (RAG) has emerged as a powerful application of Large Language Models (LLMs), revolutionizing information search and consumption. RAG systems combine traditional search capabilities with LLMs to generate comprehensive answers to user queries, ideally with accurate citations. However, in our experience of developing a RAG product, LLMs often struggle with source attribution,
- 2025: Safety reasoning is a recent paradigm where LLMs reason over safety policies before generating responses, thereby mitigating limitations in existing safety measures such as over-refusal and jailbreak vulnerabilities. However, implementing this paradigm is challenging due to the resource-intensive process of creating high-quality policy-embedded chain-of-thought (CoT) datasets while ensuring reasoning remains
- ACL Findings 2025, 2025: Dense embeddings are fundamental to modern machine learning systems, powering Retrieval Augmented Generation (RAG), information retrieval, and representation learning. While instruction-conditioning has become the dominant approach for embedding specialization, its direct application to low-capacity models imposes fundamental representational constraints that limit the performance gains derived from specialization
- 2025: Ambiguous user queries pose a significant challenge in task-oriented dialogue systems relying on information retrieval. While Large Language Models (LLMs) have shown promise in generating clarification questions to tackle query ambiguity, they rely solely on the top-k retrieved documents for clarification, which fails when ambiguity is too high to retrieve relevant documents in the first place. Traditional
- 2025: Text-to-audio generation synthesizes realistic sounds or music given a natural language prompt. Diffusion-based frameworks, including the Tango and the AudioLDM series, represent the state-of-the-art in text-to-audio generation. Despite achieving high audio fidelity, they incur significant inference latency due to the slow diffusion sampling process. MAGNET, a mask-based model operating on discrete tokens
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.