Customer-obsessed science


Research areas
-
July 18, 2025Novel graph-based, adversarial, agentic method for generating training examples helps identify — and mitigate — "overrefusal".
Featured news
-
2025Optimal prompt selection is crucial for maximizing large language model (LLM) performance on downstream tasks, especially in black-box settings where models are only accessible via APIs. Black-box prompt selection is challenging due to potentially large, combinatorial search spaces, absence of gradient information, and high evaluation cost of prompts on a validation set. We propose HbBoPs, a novel method
-
2025The rapid development of Large Language Models (LLMs) has led to their widespread adoption across various domains, leveraging vast pre-training knowledge and impressive generalization capabilities. However, these models often inherit biased knowledge, resulting in unfair decisions in sensitive applications. It is challenging to remove this biased knowledge without compromising reasoning abilities due to
-
The predominant approach for training web navigation agents gathers human demonstrations for a set of popular websites and hand-written tasks, but it is becoming clear that human data is an inefficient resource. We develop a pipeline to facilitate internet-scale training for agents without laborious human annotations. In the first stage, an LLM generates tasks for 150k diverse websites. In the next stage
-
2025Effectively selecting data from population subgroups where a model performs poorly is crucial for improving its performance. Traditional methods for identifying these subgroups often rely on sensitive information, raising privacy issues. Additionally, gathering such information at runtime might be impractical. This paper introduces a cost-effective strategy that addresses these concerns. We identify underperforming
-
2025Goal-oriented script planning, or the ability to plan coherent sequences of actions toward specific goals, is commonly used by humans to plan for daily activities. In e-commerce, customers increasingly seek LLM-based assistants to plan for them with a script and recommend products at each step, thereby facilitating convenient and efficient shopping experiences. However, this capability remains under-explored
Academia
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all