-
ACL Findings 20242024Visual Question Answering (VQA) often involves diverse reasoning scenarios across Vision and Language (V&L). Most prior VQA studies, however, have merely focused on assessing the model’s overall accuracy without evaluating it on different reasoning cases. Furthermore, some recent works observe that conventional Chain-of-Thought (CoT) prompting fails to generate effective reasoning for VQA, especially for
-
2024Machine translation is used in e-commerce to translate second-language queries into the primary language of the store, to be matched by the search system against the product catalog. However, many queries contain spelling mistakes. We first present an analysis of the spelling-robustness of a population of MT systems, quantifying how spelling variations affect MT output, the list of returned products, and
-
The issue of popularity bias—where popular items are disproportionately recommended, overshadowing less popular but potentially relevant items—remains a significant challenge in recommender systems. Recent advancements have seen the integration of general-purpose Large Language Models (LLMs) into the architecture of such systems. This integration raises concerns that it might exacerbate popularity bias,
-
2024We introduce a new, extensive multidimensional quality metrics (MQM) annotated dataset covering 11 language pairs in the biomedical domain. We use this dataset to investigate whether machine translation (MT) metrics which are fine-tuned on human-generated MT quality judgements are robust to domain shifts between training and inference. We find that fine-tuned metrics exhibit a substantial performance drop
-
Multimodal Large Language Models (MllMs) have achieved SOTA performance in various visual language tasks by fusing the visual representations with LLMs lever-aging some visual adapters. In this paper, we first establish that adapters using query-based Transformers such as Q-former is a simplified Multi-instance Learning method with-out considering instance heterogeneity/correlation. We then propose a general
Related content
-
July 22, 2019Using machine learning to train information retrieval models — such as Internet search engines — is difficult because it requires so much manually annotated data. Of course, training most machine learning systems requires manually annotated data, but because information retrieval models must handle such a wide variety of queries, they require a lot of data. Consequently, most information retrieval systems rely primarily on mechanisms other than machine learning.
-
June 27, 2019Earlier this month, Varun Sharma and Akshit Tyagi, two master’s students from the University of Massachusetts Amherst, began summer internships at Amazon, where, like many other scientists in training, they will be working on Alexa’s spoken-language-understanding systems.
-
June 13, 2019Alexa’s ability to respond to customer requests is largely the result of machine learning models trained on annotated data. The models are fed sample texts such as “Play the Prince song 1999” or “Play River by Joni Mitchell”. In each text, labels are attached to particular words — SongName for “1999” and “River”, for instance, and ArtistName for Prince and Joni Mitchell. By analyzing annotated data, the system learns to classify unannotated data on its own.
-
June 11, 2019As Alexa expands into new countries, she usually has to be trained on new languages. But sometimes, she has to be re-trained on languages she’s already learned. British English, American English, and Indian English, for instance, are different enough that for each of them, we trained a new machine learning model from scratch.
-
Animation by O’Reilly Science ArtJune 6, 2019New approach to reference resolution rewrites queries to clarify ambiguous references. -
June 5, 2019Today, customer exchanges with Alexa are generally either one-shot requests, like “Alexa, what’s the weather?”, or interactions that require multiple requests to complete more complex tasks.