Customer-obsessed science


Research areas
- July 22, 2025: Generating diverse synthetic prior distributions leads to a tabular foundation model that outperforms task-specific baselines.
Featured news
- Multimodal large language models (MLLMs) have achieved SOTA performance in various visual language tasks by fusing visual representations with LLMs through visual adapters. In this paper, we first establish that adapters using query-based Transformers, such as Q-Former, are a simplified multi-instance learning method that does not consider instance heterogeneity/correlation. We then propose a general
- 2024: Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality. However, it is unclear how fine-tuning affects desirable LLM behaviors that are not present in neural machine translation models, such as steerability, inherent document-level translation abilities, and the ability to produce less literal translations. We perform an extensive
- 2024: In this paper, we propose a novel concept of path consistency to learn robust object matching without using manual object identity supervision. Our key idea is that, to track an object through frames, we can obtain multiple different association results from a model by varying the frames it can observe, i.e., skipping frames in observation. As the differences in observations do not alter the identities of
- CoLLAs 2024, 2024: Multi-source unsupervised domain adaptation aims to leverage labeled data from multiple source domains for training a machine learning model to generalize well on a target domain without labels. Source domain selection plays a crucial role in determining the model’s performance. It relies on the similarities amongst source and target domains. Nonetheless, existing work for source domain selection often
- 2024: We propose a new method to measure the task-specific accuracy of Retrieval-Augmented Large Language Models (RAG). Evaluation is performed by scoring the RAG on an automatically generated synthetic exam composed of multiple-choice questions based on the corpus of documents associated with the task. Our method is an automated, cost-efficient, interpretable, and robust strategy to select the optimal components
Academia
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.