Customer-obsessed science


Research areas
-
July 29, 2025: New cost-to-serve-software metric that accounts for the full software development lifecycle helps determine which software development innovations provide quantifiable value.
Featured news
-
2025: Automatic evaluation of retrieval-augmented generation (RAG) systems relies on fine-grained dimensions like faithfulness and relevance, as judged by expert human annotators. Meta-evaluation benchmarks support the development of automatic evaluators that correlate well with human judgement. However, existing benchmarks predominantly focus on English or use translated data, which fails to capture cultural
-
Recent large language models have shown promising capabilities in long-form reasoning, following structured chains of thought before arriving at a final answer. However, we observe that these reasoning paths tend to include substantial redundancy; analyzing attention patterns reveals that attention scores are widely scattered, and incorrect answers in particular exhibit greater attention sparsity. In this paper
-
Designing intelligent assistants for e-commerce sellers presents significant challenges, primarily due to the abstract nature of seller queries and the complexity of orchestrating multiple internal tools. In-context planning (ICP) has emerged as a promising adaptive problem-solving approach for this setting. However, selecting effective exemplars for ICP remains a difficult problem, largely because of the
-
KDD 2025 Workshop on Prompt Optimization, 2025: Prompt engineering represents a critical bottleneck in harnessing the full potential of Large Language Models (LLMs) for solving complex tasks, as it requires specialized expertise, significant trial-and-error, and manual intervention. This challenge is particularly pronounced for tasks involving subjective quality assessment, where defining explicit optimization objectives becomes fundamentally problematic
-
Enterprise accounting data is complex, ambiguous, and shaped by evolving systems and regulations. The institutional knowledge needed to reason over the data is sparse, scattered, and rarely structurally documented, posing major challenges for LLM agents. We introduce a multi-agent financial research framework that mimics a junior analyst’s onboarding and growth. The Analyst Agent learns proactively from repeated
Academia
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.