Customer-obsessed science


Research areas
-
September 2, 2025Audible's ML algorithms connect users directly to relevant titles, reducing the number of purchase steps for millions of daily users.
-
Featured news
-
ICRA 20242024When a mobile robot autonomously explores an indoor space to produce a localization and navigation map, it is important to create both a stable pose graph and a high-quality occupancy map that covers all the navigable areas. In this work, we propose a novel probabilistic active loop closure framework which attempts to maximally reduce pose graph uncertainty during exploration and improves occupancy map
-
2024The Segment Anything Model (SAM) stands as a foundational framework for image segmentation. While it exhibits remarkable zero-shot generalization in typical scenarios, its advantage diminishes when applied to specialized domains like medical imagery and remote sensing. To address this limitation, this paper introduces Conv-LoRA, a simple yet effective parameter-efficient fine-tuning approach. By integrating
-
EACL 20242024Users of AI-based virtual assistants and search systems encounter challenges in articulating their intents while seeking information on unfamiliar topics, possibly due to complexity of the user’s intent or the lack of meta-information on the topic. We posit that an iterative suggested question-answering (SQA) conversation can improve the trade-off between the satisfaction of the user’s intent while keeping
-
2024Vision-Language (VL) models have gained significant research focus, enabling remarkable advances in multimodal reasoning. These architectures typically comprise a vision encoder, a Large Language Model (LLM), and a projection module that aligns visual features with the LLM’s representation space. Despite their success, a critical limitation persists: the vision encoding process remains decoupled from user
-
2024Sequence-to-sequence vision-language models are showing promise, but their applicability is limited by their inference latency due to their autoregressive way of generating predictions. We propose a parallel decoding sequence-to-sequence vision-language model, trained with a Query-CTC loss, that marginalizes over multiple inference paths in the decoder. This allows us to model the joint distribution of
Conferences
Academia
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all