Customer-obsessed science
Research areas
-
December 5, 2025. 6 min read. A multiagent architecture separates data perception, tool knowledge, execution history, and code generation, enabling ML automation that works with messy, real-world inputs.
-
November 20, 2025. 4 min read.
Featured news
-
2025. We present CEDA, a novel multimodal framework for detecting hallucinations in large language model outputs through a multi-agent debate approach. While existing methods for black-box LLMs often rely on response sampling and self-consistency checking, our framework leverages a three-fold approach: a multi-agent debate setting to critically examine and debate the authenticity of generated content, a lightweight
-
NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle, 2025. Building infrastructure-as-code (IaC) in cloud computing is a critical task, underpinning the reliability, scalability, and security of modern software systems. Despite the remarkable progress of large language models (LLMs) in software engineering – demonstrated across many dedicated benchmarks – their capabilities in developing IaC remain underexplored. Unlike existing IaC benchmarks that predominantly
-
2025. Understanding causal relationships among the variables of a system is paramount to explain and control its behavior. For many real-world systems, however, the true causal graph is not readily available and one must resort to predictions made by algorithms or domain experts. Therefore, metrics that quantitatively assess the goodness of a causal graph provide helpful checks before using it in downstream tasks
-
NeurIPS 2025 Workshop on New Perspectives in Graph Machine Learning, 2025. Graph Neural Networks (GNNs) have proven to be highly effective for link and edge prediction across domains ranging from social networks to drug discovery. However, processing extremely large graphs with millions of densely connected nodes poses significant challenges in terms of computational efficiency, learning speed, and memory management. Thus making Graph Foundational Model very computationally expensive
-
CIKM 2025, 2025. Relevance in e-commerce product search is critical to ensuring that results accurately reflect customer intent. While large language models (LLMs) have recently advanced natural language processing capabilities, their high inference latency and significant infrastructure demands make them less suitable for real-time e-commerce applications. Consequently, transformer-based encoder models are widely adopted
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.