- 2024: Modern language models (LMs) need to follow human instructions while being faithful; yet, they often fail to achieve both. Here, we provide concrete evidence of a trade-off between instruction following (i.e., following open-ended instructions) and faithfulness (i.e., grounding responses in given context) when training LMs with these objectives. For instance, fine-tuning LLaMA-7B on instruction-following datasets …
- 2024: Question answering based on retrieval-augmented generation (RAG-QA) is an important research topic in NLP and has a wide range of real-world applications. However, most existing datasets for this task are either constructed using a single source corpus or consist of short extractive answers, which fall short of evaluating large language model (LLM)-based RAG-QA systems on cross-domain generalization. …
- 2024: Learning of preference models from human feedback has been central to recent advances in artificial intelligence. Motivated by the cost of obtaining high-quality human annotations, we study efficient human preference elicitation for learning preference models. The key idea in our work is to generalize optimal designs, a methodology for computing optimal information-gathering policies, to questions with …
- 2024: Large language model advancements have enabled the development of multi-agent frameworks to tackle complex, real-world problems, such as automating tasks that require interactions with diverse tools, reasoning, and human collaboration. We present MARCO, a Multi-Agent Real-time Chat Orchestration framework for automating tasks using LLMs. MARCO addresses key challenges in utilizing LLMs for complex, multi-step …
- 2024: Training large AI models such as deep learning recommendation systems and large language models (LLMs) requires massive GPU resources and computing time. The high training cost has become affordable only to big tech companies, while also raising increasing concerns about its environmental impact. This paper presents CoMERA, a Computing- and Memory-Efficient training method via Rank-Adaptive tensor optimization.
Related content
- July 09, 2023: The finding that 70% of attention heads and 20% of feed-forward networks can be excised with minimal effect on in-context learning suggests that large language models are undertrained.
- July 07, 2023: Amazon’s Yang Liu, general chair of this year’s meeting of the Association for Computational Linguistics, on the road ahead for LLMs.
- July 06, 2023: The program exposes students to computer science as they create their own Alexa skills.
- July 05, 2023: Amazon Research Award recipient Shrikanth Narayanan is on a mission to make inclusive human-AI conversational experiences.
- July 03, 2023: With little training data and no mapping of speech to phonemes, Amazon researchers used voice conversion to generate Irish-accented training data in Alexa’s own voice.
- June 26, 2023: How phonetically blended results (PBR) help ensure customers find the content they were actually asking for.