Customer-obsessed science
Research areas
- May 14, 2026, 16 min read: By focusing on specific failure points and suggesting targeted solutions, a new automated prompt-engineering framework improves prompt performance without compromising existing functionality.
- April 15, 2026, 8 min read
- April 7, 2026, 13 min read
Featured news
- ICLR 2026 Workshop on Time Series in the Age of Large Models, 2026: Changepoint detection algorithms identify where structural breaks occur but are conventionally used under a one-to-one mapping between detected breaks and real-world events. We show this mapping assumption is undermined by a fundamental ambiguity: the confidence interval for a detected break widens as the slope jump shrinks, so a wide interval may indicate either a mild genuine break or an approximation...
- CLeaR 2026, 2026: In this paper we show how to exploit interventional data to acquire the joint conditional distribution of all the variables using the Maximum Entropy principle. To this end, we extend the Causal Maximum Entropy method to make use of data arising from identifiable interventional distributions in addition to data from the observational distribution. Using Lagrange duality, we prove that the solution to the...
- ICSE 2026, 2026: Large Language Models (LLMs) are increasingly integrated into software systems as automated decision-making components. These systems rely on instruction prompts written in natural language to encode complex workflows. However, debugging these prompts when LLMs produce undesired outputs remains challenging due to their black-box nature and the impracticality of manually inspecting large, complex inputs.
- CSER 2026, 2026: Construction management systems require realistic test data capturing complex stakeholder interactions and temporal dependencies, yet accessing real project data remains challenging due to privacy constraints and proprietary information protection. This research addresses a critical systems engineering challenge by introducing agentic simulacra patterns that leverage multi-agent coordination to generate...
- 2026: Multi-Agent Debate (MAD) frameworks improve factual reliability in large language models (LLMs) by allowing agents to critique and refine one another's reasoning. Yet, existing MAD systems are computationally expensive and prone to degradation under prolonged debates due to redundant exchanges and unstable judging. We propose a lightweight, industry-deployable alternative that unifies Selective Debate Initiation...
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.