- MDPI Sensors Journal, 2023. This paper proposes, analyzes, and evaluates a deep learning architecture based on transformers for generating sign language motion from sign phonemes (represented using HamNoSys, a notation system developed at the University of Hamburg). The sign phonemes provide information about sign characteristics like hand configuration, localization, or movements. The use of sign phonemes is crucial for generating…
- NeurIPS 2023 Workshop on Robustness of Zero/Few-shot Learning in Foundation Models (R0-FoMo), 2023. With the recent surge of language models in different applications, attention to the safety and robustness of these models has gained significant importance. Here we introduce a joint framework in which we simultaneously probe and improve the robustness of a black-box target model via adversarial prompting and belief augmentation using iterative feedback loops. This framework utilizes an automated red teaming…
- NeurIPS 2023 Workshop on Efficient Natural Language and Speech Processing (ENLSP-III), 2023. While data selection methods have been studied extensively in active learning, data pruning, and data augmentation settings, there is little evidence for the efficacy of these methods in industry-scale settings, particularly in low-resource languages. Our work presents ways of assessing prospective training examples in those settings for their "usefulness" or "difficulty". We also demonstrate how these…
- NeurIPS 2023 Workshop on I Can’t Believe It’s Not Better (ICBINB): Failure Modes in the Age of Foundation Models, 2023. Numerous natural language processing (NLP) tasks require precisely labeled data to ensure effective model training and achieve optimal performance. However, data annotation incurs substantial costs and time requirements, especially when specialized domain expertise is needed or a large number of samples must be annotated. In this study, we investigate the feasibility of employing large language models (LLMs)…
- ACM MMSports 2023. Sports highlights are an important form of media for fans worldwide, as they provide short videos that capture key moments from games, often accompanied by the original commentaries of the game’s announcers. However, traditional forms of presenting sports highlights have limitations in conveying the complexity and nuance of the game. In recent years, the use of large language models (LLMs) for natural language…
Related content
- June 7, 2018. Alexa is a cloud-based service with natural-language-understanding capabilities that powers devices like Amazon Echo, Echo Show, Echo Plus, Echo Spot, Echo Dot, and more. Alexa-like voice services traditionally have supported small numbers of well-separated domains, such as calendar or weather. In an effort to extend the capabilities of Alexa, Amazon in 2015 released the Alexa Skills Kit, so third-party developers could add to Alexa’s voice-driven capabilities. We refer to new third-party capabilities as skills, and Alexa currently has more than 40,000.
- June 1, 2018. Developing a new Alexa skill typically means training a machine-learning system with annotated data, and the skill’s ability to “understand” natural-language requests is limited by the expressivity of the semantic representation used to do the annotation. So far, the techniques used to represent natural language have been fairly simple, so Alexa has been able to handle only relatively simple requests.
- May 29, 2018. As Alexa-enabled devices continue to expand into new countries, we propose an approach for quickly bootstrapping machine-learning models in new languages, with the aim of more efficiently bringing Alexa to new customers around the world.
- May 24, 2018. Amazon scientists are continuously expanding Alexa’s natural-language-understanding (NLU) capabilities to make Alexa smarter, more useful, and more engaging.
- May 11, 2018. Smart speakers, such as the Amazon Echo family of products, are growing in popularity among consumer and business audiences. In order to improve the automatic speech recognition (ASR) and full-duplex voice communication (FDVC) performance of these smart speakers, acoustic echo cancellation (AEC) and noise reduction systems are required. These systems reduce the noise and echoes that can impact operation, such as an Echo device accurately hearing the wake word “Alexa.”
- May 4, 2018. In recent years, the amount of textual information produced daily has increased exponentially. This information explosion has been accelerated by the ease with which data can be shared across the web. Most of the textual information is generated as free-form text, and only a small fraction is available in structured format (Wikidata, Freebase, etc.) that can be processed and analyzed directly by machines.