Customer-obsessed science
Research areas
-
June 8, 2026, 7 min read: Four approaches can dramatically improve the performance and trustworthiness of AI agents in operational environments.
-
May 27, 2026, 4 min read: Machine learning
Featured news
-
NeurIPS 2025 Workshop on Uncovering Causality in Science, 2025: Online randomized controlled experiments (A/B tests) measure causal changes in industry. While these experiments use incremental changes to minimize disruption, they often yield statistically insignificant results due to low signal-to-noise ratios. Precision improvement (or reducing standard error) traditionally focuses on trigger observations - where treatment and control outputs differ. Though effective
-
KDD 2025 Workshop on AI Agent for Information Retrieval, 2025: In this paper, we present CACHE-ED, a novel framework for document entity extraction that combines the power of large language models (LLMs) with graph-based document representations, caching mechanisms, and an actor-critic multi-agent architecture. Our approach addresses the inefficiencies and inaccuracies that are common in extracting structured information from documents, particularly in templated formats
-
NeurIPS 2025 Workshop on Mathematical Reasoning and AI, 2025: We present an approach for training language models to interactively prove theorems using the Lean proof assistant. Our approach enables models to propose partial proofs, receive verification feedback, and iteratively refine their proofs. We develop a synthetic data generation pipeline that converts static proof datasets into multi-turn interactive sequences, complete with incremental verification feedback
-
Machine Learning for Healthcare 2025, 2025: Large language models demonstrate impressive performance on standardized healthcare benchmarks, yet their deployment readiness for real-world environments remains poorly understood. Current medical benchmarks present idealized scenarios that misrepresent the complexity of actual clinical data. We systematically evaluate LLM robustness by introducing clinician-validated perturbations to MedQA that mirror
-
Winter Simulation Conference 2025, 2025: The integration of Computer-Aided Design (CAD) models into discrete event simulation software is a critical requirement for many simulation projects, particularly those involving the movement of people or vehicles where spatial accuracy directly impacts study outcomes. While importing CAD files and configuring simulation elements is essential for system accuracy, this process is typically time-consuming
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.