Customer-obsessed science
Research areas
- April 17, 2026, 6 min read. Isabelle/HOL's balance of expressiveness, automation, and scalability enabled the world's first formally verified cloud hypervisor.
Featured news
- IEEE Big Data 2025, 2025. Enterprise relational databases increasingly contain vast amounts of non-semantic data (IP addresses, product identifiers, encoded keys, and timestamps) that challenge traditional semantic analysis. This paper introduces a novel Character-Level Autoencoder (CAE) approach that automatically identifies and groups semantically identical columns in non-semantic relational datasets by detecting column similarities
- IJCNLP-AACL 2025, 2025. In recent years, dense retrieval has been the focus of information retrieval (IR) research. While effective, dense retrieval produces uninterpretable dense vectors and suffers from a large index size. Learned sparse retrieval (LSR) has emerged as a promising alternative, achieving competitive retrieval performance while also being able to leverage the classical inverted index data structure
- 2025. Previous AutoML systems have made progress in automating machine learning workflows, but still require significant manual setup and expert knowledge. This paper presents a novel multi-agent system that integrates Large Language Models (LLMs) with external knowledge bases of existing machine learning tools to automate the complete end-to-end solution. To address the limitations of pure LLM solutions, including
- NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle, 2025. Recent research has demonstrated that debate mechanisms among Large Language Models (LLMs) show remarkable potential for enhancing reasoning capabilities and promoting responsible text generation. However, it remains an open question whether debate strategies can effectively generalize to Multi-Modal Large Language Models (MLLMs). In this paper, we address this challenge by proposing a location-aware debate
- 2025. Programming assistants powered by large language models have transformed software development, yet most benchmarks focus narrowly on code generation tasks. Recent efforts like InfiBench and StackEval attempt to address this gap using Stack Overflow data but remain limited to single-turn interactions in isolated contexts, require significant manual curation, and fail to represent complete project environments
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.