Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Multimodal LLM augmented reasoning for interpretable visual perception analysis

Shravan Chaudhari, Trilokya Akula, Yoon Kim, Tom Blake

CHI 2025

2025

In this paper, we advance the study of AI-augmented reasoning in the context of Human-Computer Interaction (HCI), psychology and cognitive science, focusing on the critical task of visual perception. Specifically, we investigate the applicability of Multimodal Large Language Models (MLLMs) in this domain. To this end, we leverage established principles and explanations from psychology and cognitive science

Conversational AI
R-VLM: Region-aware vision language model for precise GUI grounding

Joonhyung Park, Peng Tang, Sagnik Das, Srikar Appalaraju, Kunwar Yashraj Singh, R. Manmatha, Shabnam Ghadar

ACL 2025, CVPR 2025

2025

Visual agent models for automating human activities on Graphical User Interfaces (GUIs) have emerged as a promising research direction, driven by advances in large Vision Language Models (VLMs). A critical challenge in GUI automation is the precise grounding of interface elements across diverse platforms. Existing vision-only GUI agents directly ground elements from large and cluttered screenshots, requiring

Computer vision
Turbocharging web automation: The impact of compressed history states

Xiyue Zhu, Peng Tang, Haofu Liao, Srikar Appalaraju

ACL 2025

2025

Language models have led to a leap forward in web automation. The current web automation approaches take the current web state, history actions, and language instruction as inputs to predict the next action, overlooking the importance of history states. However, the highly verbose nature of web page states can result in long input sequences and sparse information, hampering the effective utilization of

Conversational AI
MEMERAG: A multilingual end-to-end meta-evaluation benchmark for retrieval augmented generation

Andrea Cruz, Jayasimha Talur, Bruno Charron, Dong Liu, Saab Mansour, Marcello Federico

ACL 2025

2025

Automatic evaluation of retrieval augmented generation (RAG) systems relies on fine-grained dimensions like faithfulness and relevance, as judged by expert human annotators. Meta-evaluation benchmarks support the development of automatic evaluators that correlate well with human judgement. However, existing benchmarks predominantly focus on English or use translated data, which fails to capture cultural

Conversational AI
Think clearly: Improving reasoning via redundant token pruning

Daewon Choi, Jimin Lee, Jihoon Tack, Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati

ICML 2025 Workshop on Efficient Systems for Foundation Models, ARR 2025

2025

Recent large language models have shown promising capabilities in long-form reasoning, following structured chains of thought before arriving at a final answer. However, we observe that these reasoning paths tend to include substantial redundancy; analyzing attention patterns reveals that attention scores are widely scattered, and in particular that incorrect answers exhibit greater attention sparsity. In this paper

Conversational AI

Credit: Shirin Saleem

How Alexa's new Live Translation for conversations works

Shirin Saleem, Roland Maas

December 14, 2020

Parallel speech recognizers, language ID, and translation models geared to conversational speech are among the modifications that make Live Translation possible.

Conversational AI
Amazon Alexa scientists Yang Liu and Ruhi Sarikaya named IEEE Fellows

Staff writer

December 3, 2020

Scientists are recognized for their contributions to conversational understanding systems.

Conversational AI
Credit: Glynis Condon

A version of the BERT language model that’s 20 times as fast

Adrian de Wynter

December 3, 2020

Determining the optimal architectural parameters reduces network size by 84% while improving performance on natural-language-understanding tasks.

Machine learning
Credit: Glynis Condon

Mitigating social bias in knowledge graph embeddings

Joseph Fisher

November 25, 2020

Method significantly reduces bias while maintaining comparable performance on machine learning tasks.

Information and knowledge management
Alexa & Friends features Ruhi Sarikaya, Alexa AI director of applied science

Staff writer

November 24, 2020

Newly named IEEE Fellow discusses his experience in the field of conversational AI, and the ways he and his team are working to make Alexa more intelligent.

Conversational AI
Credit: from "Joint turn and dialogue level user satisfaction estimation on multi-domain conversations"

Automatically assessing conversations with Alexa

Aditya Tiwari, Josep Valls-Vargas

November 24, 2020

Model for estimating customer satisfaction with interactions that span multiple domains improves on predecessors by 27%.

Conversational AI
