Customer-obsessed science
December 5, 2025 · 6 min read
A multiagent architecture separates data perception, tool knowledge, execution history, and code generation, enabling ML automation that works with messy, real-world inputs.
November 20, 2025 · 4 min read
Featured news
Code@MIT 2025
This paper examines the effectiveness of stratification in experimental design using evidence from multiple large-scale experiments. We analyze data from experiments ranging from approximately 30,000 to 180,000 units across different business contexts. Our results show that pre-stratification and post-stratification achieve virtually identical precision improvements, largest in smaller samples (10% improvement
Code@MIT 2025
Determining appropriate experimental duration remains a challenging problem in online experimentation. While experimenters ideally would know in advance how long to run experiments in order to inform confident business decisions, many factors affecting the conclusiveness of their results are difficult to predict prior to the experiment. Consequently, experimentation services develop 'in-flight' tools that suggest
NeurIPS 2025 Workshop on Efficient Reasoning
Large reasoning models (LRMs) excel at reasoning tasks but face deployment barriers due to computational constraints, regulatory requirements, and domain-specific knowledge gaps. This work addresses these limitations by developing cost-efficient post-training methods to enhance reasoning capabilities. Using Qwen3-4B as our base model, we investigate variations of efficient Supervised Fine-Tuning (SFT) and
IJCNLP-AACL 2025
Dense Retrieval (DR) models have proven to be effective for Document Retrieval and Information Grounding tasks. Usually, these models are trained and optimized to improve the relevance of top-ranked documents for a given query. Previous work has shown that popular DR models are sensitive to the query and document lexicon: small variations in it may lead to a significant difference in the set of retrieved
2025
Large Language Models (LLMs) have emerged as powerful tools for generating coherent text, understanding context, and performing reasoning tasks. However, they struggle with temporal reasoning, which requires processing time-related information such as event sequencing, durations, and inter-temporal relationships. These capabilities are critical for applications including question answering, scheduling,
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.