Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

II-MMR: Identifying and improving multi-modal multi-hop reasoning in visual question answering

Jihyung Kil, Farideh Tavazoee, Dongyeop Kang, Joo-Kyung Kim

ACL Findings 2024

2024

Visual Question Answering (VQA) often involves diverse reasoning scenarios across Vision and Language (V&L). Most prior VQA studies, however, have merely focused on assessing the model’s overall accuracy without evaluating it on different reasoning cases. Furthermore, some recent works observe that conventional Chain-of-Thought (CoT) prompting fails to generate effective reasoning for VQA, especially for

Computer vision
Impacts of misspelled queries on translation and product search

Greg Hanneman, Natawut Monaikul, Taichi Nakatani

ACL 2024

2024

Machine translation is used in e-commerce to translate second-language queries into the primary language of the store, to be matched by the search system against the product catalog. However, many queries contain spelling mistakes. We first present an analysis of the spelling-robustness of a population of MT systems, quantifying how spelling variations affect MT output, the list of returned products, and

Conversational AI
Large language models as recommender systems: A study of popularity bias

Jan Malte Lichtenberg, Alexander Buchholz, Pola Schwöbel

SIGIR 2024 Workshop on Generative Information Retrieval

2024

The issue of popularity bias—where popular items are disproportionately recommended, overshadowing less popular but potentially relevant items—remains a significant challenge in recommender systems. Recent advancements have seen the integration of general-purpose Large Language Models (LLMs) into the architecture of such systems. This integration raises concerns that it might exacerbate popularity bias,

Conversational AI
Fine-tuned machine translation metrics struggle in unseen domains

Vilém Zouhar, Shuoyang Ding, Anna Currey, Tatyana Badeka, Jenyuan Wang, Brian Thompson

ACL 2024

2024

We introduce a new, extensive multidimensional quality metrics (MQM) annotated dataset covering 11 language pairs in the biomedical domain. We use this dataset to investigate whether machine translation (MT) metrics which are fine-tuned on human-generated MT quality judgements are robust to domain shifts between training and inference. We find that fine-tuned metrics exhibit a substantial performance drop

Conversational AI
Enhancing multimodal large language models with multi-instance visual prompt generator for visual representation enrichment

Wenliang Zhong, Wenyi Wu, Qi Li, Rob Barton, Boxin Du, Shioulin Sam, Karim Bouyarmane, Ismail Tutar, Junzhou Huang

SIGIR 2024 Workshop on Multimodal Representation and Retrieval

2024

Multimodal Large Language Models (MLLMs) have achieved SOTA performance in various visual language tasks by fusing the visual representations with LLMs leveraging some visual adapters. In this paper, we first establish that adapters using query-based Transformers such as Q-former is a simplified Multi-instance Learning method without considering instance heterogeneity/correlation. We then propose a general

Computer vision

Bringing the Power of Neural Networks to the Problem of Search

Kai Hui

July 22, 2019

Using machine learning to train information retrieval models — such as Internet search engines — is difficult because it requires so much manually annotated data. Of course, training most machine learning systems requires manually annotated data, but because information retrieval models must handle such a wide variety of queries, they require a lot of data. Consequently, most information retrieval systems rely primarily on mechanisms other than machine learning.

Search and information retrieval
Amazon Mentors Help UMass Graduate Students Make Concrete Advances on Vital Machine Learning Problems

Larry Hardesty

June 27, 2019

Earlier this month, Varun Sharma and Akshit Tyagi, two master’s students from the University of Massachusetts Amherst, began summer internships at Amazon, where, like many other scientists in training, they will be working on Alexa’s spoken-language-understanding systems.

Conversational AI
Active learning: Algorithmically selecting training data to improve Alexa’s natural-language understanding

Stanislav Peshterliev

June 13, 2019

Alexa’s ability to respond to customer requests is largely the result of machine learning models trained on annotated data. The models are fed sample texts such as “Play the Prince song 1999” or “Play River by Joni Mitchell”. In each text, labels are attached to particular words: SongName for “1999” and “River”, for instance, and ArtistName for “Prince” and “Joni Mitchell”. By analyzing annotated data, the system learns to classify unannotated data on its own.
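
To make the annotation scheme concrete, here is a minimal sketch of how the two sample texts might be represented as labeled tokens. The SongName and ArtistName labels come from the text above; the "Other" label for unannotated words, the data layout, and the `extract_slots` helper are assumptions made for illustration, not a description of Alexa's actual data format.

```python
from typing import Dict, List, Tuple

# Each utterance is a list of (token, label) pairs.
# "Other" is a hypothetical label for words that carry no slot.
AnnotatedUtterance = List[Tuple[str, str]]

examples: List[AnnotatedUtterance] = [
    [("Play", "Other"), ("the", "Other"), ("Prince", "ArtistName"),
     ("song", "Other"), ("1999", "SongName")],
    [("Play", "Other"), ("River", "SongName"), ("by", "Other"),
     ("Joni", "ArtistName"), ("Mitchell", "ArtistName")],
]


def extract_slots(utterance: AnnotatedUtterance) -> Dict[str, List[str]]:
    """Merge consecutive tokens that share a label into one slot value."""
    slots: Dict[str, List[str]] = {}
    prev_label = None
    for token, label in utterance:
        if label == "Other":
            prev_label = None
            continue
        if label == prev_label:
            # Same label as the previous token: extend the current value.
            slots[label][-1] += " " + token
        else:
            # A new slot value starts here.
            slots.setdefault(label, []).append(token)
        prev_label = label
    return slots


for utterance in examples:
    print(extract_slots(utterance))
```

A model trained on many such pairs would be asked to predict the label column for texts where only the tokens are known.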

Conversational AI
Adapting Alexa to regional language variations

Young-Bum Kim

June 11, 2019

As Alexa expands into new countries, she usually has to be trained on new languages. But sometimes, she has to be re-trained on languages she’s already learned. British English, American English, and Indian English, for instance, are different enough that for each of them, we trained a new machine learning model from scratch.

Conversational AI
Teaching Alexa to follow conversations

Arpit Gupta

June 6, 2019

New approach to reference resolution rewrites queries to clarify ambiguous references.

Conversational AI
Amazon Unveils Novel Alexa Dialog Modeling for Natural, Cross-Skill Conversations

Alexa Science Team

June 5, 2019

Today, customer exchanges with Alexa are generally either one-shot requests, like “Alexa, what’s the weather?”, or interactions that require multiple requests to complete more complex tasks.

Conversational AI
