Customer-obsessed science
Research areas
-
November 20, 20254 min readA new evaluation pipeline called FiSCo uncovers hidden biases and offers an assessment framework that evolves alongside language models.
-
-
-
September 2, 20253 min read
-
Featured news
-
2025Vision-Language Models (VLMs) excel in diverse visual tasks but face challenges in document understanding, which requires fine-grained text processing. While typical visual tasks perform well with low-resolution inputs, readingintensive applications demand high-resolution, resulting in significant computational overhead. Using OCR-extracted text in VLM prompts partially addresses this issue but underperforms
-
Visual agent models for automating human activities on Graphical User Interfaces (GUIs) have emerged as a promising research direction, driven by advances in large Vision Language Models (VLMs). A critical challenge in GUI automation is the precise grounding of interface elements across diverse platforms. Existing vision-only GUI agents directly ground elements from large and cluttered screenshots, requiring
-
2025Language models have led to a leap forward in web automation. The current web automation approaches take the current web state, history actions, and language instruction as inputs to predict the next action, overlooking the importance of history states. However, the highly verbose nature of web page states can result in long input sequences and sparse information, hampering the effective utilization of
-
2025Automatic evaluation of retrieval augmented generation (RAG) systems relies on fine grained dimensions like faithfulness and relevance, as judged by expert human annotators. Meta-evaluation benchmarks support the development of automatic evaluators that correlate well with human judgement. However, existing benchmarks predominantly focus on English or use translated data, which fails to capture cultural
-
ICML 2025 Workshop on Efficient Systems for Foundation Models, ARR 20252025Recent large language models have shown promising capabilities in long-form reasoning, following structured chains of thought before arriving at a final answer. However, we observe that these reasoning paths tend to include substantial redundancy; analyzing attention patterns reveals that attention scores are widely scattered, particularly incorrect answers exhibit greater attention sparsity. In this paper
Collaborations
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all