- Task-oriented dialogue systems are essential for applications ranging from customer service to personal assistants and are widely used across various industries. However, developing effective multi-domain systems remains a significant challenge due to the complexity of handling diverse user intents, entity types, and domain-specific knowledge across several domains. In this work, we propose DARD (Domain …
- (2024) The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems, leading to the prevalence of AI-generated content and challenges in detecting misinformation and managing conflicting information, or "inter-evidence conflicts." This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation …
- In-Context Learning (ICL) has enabled Large Language Models (LLMs) to excel as general-purpose models in zero- and few-shot task settings. However, since LLMs are often not trained on the downstream tasks, they lack crucial contextual knowledge from the data distributions, which limits their task adaptability. This paper explores using data priors to automatically customize prompts in ICL. We extract these … (an illustrative sketch of data-prior prompt customization follows this list)
- (2024) Pre-trained language models, trained on large-scale corpora, demonstrate strong generalizability across various NLP tasks. Fine-tuning these models for specific tasks typically involves updating all parameters, which is resource-intensive. Parameter-efficient fine-tuning (PEFT) methods, such as the popular LoRA family, introduce low-rank matrices to learn only a few parameters efficiently. However, during … (a minimal LoRA-style adapter sketch follows this list)
- In-context learning (ICL) is a powerful paradigm where large language models (LLMs) benefit from task demonstrations added to the prompt. Yet, selecting optimal demonstrations is not trivial, especially for complex or multi-modal tasks where input and output distributions differ. We hypothesize that forming task-specific representations of the input is key. In this paper, we propose a method to align representations … (a demonstration-selection sketch follows this list)
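The data-priors abstract above is cut off before it explains how the priors are extracted, so the sketch below is only a generic illustration of the idea, not the paper's method: compute a simple statistic from the training data (here, the label distribution) and prepend it to an ICL prompt. The helpers `label_prior_prefix` and `build_prompt` are hypothetical names introduced for this example.

```python
from collections import Counter

def label_prior_prefix(train_labels):
    """Summarize a dataset's label distribution as a one-line prompt prefix.

    Illustrative 'data prior': the model is told how often each class occurs
    before it sees any demonstrations or the test input. (Hypothetical helper,
    not the paper's extraction method.)
    """
    counts = Counter(train_labels)
    total = sum(counts.values())
    parts = [f"{label}: {count / total:.0%}" for label, count in counts.most_common()]
    return "Label distribution for this task: " + ", ".join(parts) + "\n\n"

def build_prompt(train_labels, demonstrations, test_input):
    """Assemble an ICL prompt: data prior, then demonstrations, then the query."""
    prompt = label_prior_prefix(train_labels)
    for text, label in demonstrations:
        prompt += f"Input: {text}\nLabel: {label}\n\n"
    return prompt + f"Input: {test_input}\nLabel:"

# Toy usage with an imbalanced sentiment dataset.
labels = ["positive"] * 70 + ["negative"] * 30
demos = [("great service", "positive"), ("slow and rude staff", "negative")]
print(build_prompt(labels, demos, "the food was cold"))
```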
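The PEFT entry notes that LoRA-family methods add low-rank matrices and train only those few parameters. Below is a minimal, generic LoRA-style linear adapter in PyTorch, assuming the usual formulation W x + (alpha / r) * B A x with the pre-trained weight frozen; the class name `LoRALinear` and the default rank and scaling are illustrative choices, not code from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update B @ A.

    Only A and B are trained, so the trainable parameter count is
    r * (in_features + out_features) instead of in_features * out_features.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # freeze pre-trained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

# Toy usage: wrap a 768x768 projection and count trainable parameters.
layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 8 * 768 = 12,288 trainable parameters vs. ~590k for full fine-tuning
```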
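The last abstract is truncated before it describes its representation-alignment method, so purely as a point of reference the sketch below shows a common demonstration-selection baseline: embed the candidate demonstrations and the query, keep the most similar ones, and concatenate them into the prompt. The `embed` function here is a hash-based stand-in for a real sentence encoder and is not the paper's approach.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Hashed bag-of-words embedding; a stand-in for a real sentence encoder."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def select_demonstrations(pool, query, k=2):
    """Return the k pool examples whose inputs are most similar to the query."""
    q = embed(query)
    return sorted(pool, key=lambda ex: float(embed(ex[0]) @ q), reverse=True)[:k]

# Toy usage: pick demonstrations for a movie-review query, then build the prompt.
pool = [("the film was dull", "negative"),
        ("an instant classic", "positive"),
        ("battery drains fast", "negative"),
        ("camera quality is superb", "positive")]
query = "the movie was a classic"
prompt = "".join(f"Input: {x}\nLabel: {y}\n\n" for x, y in select_demonstrations(pool, query))
prompt += f"Input: {query}\nLabel:"
print(prompt)
```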
Related content
- June 06, 2022: Combination of distillation and distillation-aware quantization compresses BART model to 1/16th its size. (Based on a figure from "TernaryBERT: Distillation-aware ultra-low bit BERT".)
- June 01, 2022: Knowledge distillation and discriminative training enable efficient use of a BERT-based model to rescore automatic-speech-recognition hypotheses.
- May 27, 2022: Amazon Scholar and Columbia professor Kathleen McKeown on model compression, data distribution shifts, language revitalization, and more.
- May 17, 2022: Papers focus on speech conversion and data augmentation — and sometimes both at once.
- May 12, 2022: Multimodal training, signal-to-interpretation, and BERT rescoring are just a few topics covered by Amazon’s 21 speech-related papers.
- May 10, 2022: Topics range from the predictable, such as speech recognition and signal processing, to time series forecasting and personalization.