Machine learning

Developing algorithms and statistical models that computer systems use to perform tasks without explicit instructions, relying on patterns and inference instead.

Proposer-agent-evaluator (PAE): Autonomous skill discovery for foundation model internet agents

Yifei Zhou, Qianlan Yang, Kaixiang Lin, Min Bai, Xiong Zhou, Yu-Xiong Wang, Sergey Levine, Erran Li

ICML 2025

2025

A generalist foundation model agent needs a large and diverse skill repertoire, such as finding directions between two travel locations and buying specific items from the Internet. If each skill must be specified manually through a fixed set of human-annotated instructions, the agent’s skill repertoire will necessarily be limited by the poor scalability of human-annotated instructions. In this

Machine learning
HybGRAG: Hybrid retrieval-augmented generation on textual and relational knowledge bases

Jeremy Lee, Qi Zhu, Costas Mavromatis, Zhen Han, Soji Adeshina, Vassilis N. Ioannidis, Huzefa Rangwala, Christos Faloutsos

ACL 2025

2025

Given a semi-structured knowledge base (SKB), where text documents are interconnected by relations, how can we effectively retrieve relevant information to answer user questions? Retrieval-Augmented Generation (RAG) retrieves documents to assist large language models (LLMs) in question answering, while Graph RAG (GRAG) uses structured knowledge bases as its knowledge source. However, many questions require

Machine learning
Quantile regression with large language models for price prediction

Nikhita Vedula, Dushyanta Dhyani, Laleh Jalali, Boris Oreshkin, Mohsen Bayati, Shervin Malmasi

ACL 2025

2025

Large Language Models (LLMs) have shown promise in structured prediction tasks, including regression, but existing approaches primarily focus on point estimates and lack systematic comparison across different methods. We investigate probabilistic regression using LLMs for unstructured inputs, addressing challenging text-to-distribution prediction tasks such as price estimation where both nuanced text understanding

Machine learning
AdaRec: Adaptive recommendation with LLMs via narrative profiling and dual-channel reasoning

Kira Wang, Charin Polpanumas

ICML 2025 Workshop on Foundation Models for Structured Data

2025

We propose AdaRec, a few-shot in-context learning framework that leverages Large Language Models (LLMs) for adaptive personalized recommendation. AdaRec introduces narrative profiling, transforming user-item interactions into natural language representations to enable unified task handling and enhance human readability. Centered on a bivariate reasoning paradigm, AdaRec employs a dual-channel architecture

Machine learning
Cold-start audiobook recommendation via cross-domain sub-tower fusion

Kirandeep Kaur, Amit Goyal

WSDM 2025

2025

For music streaming services expanding into audiobooks, cold-start personalization presents a critical challenge: as audiobooks are a newly introduced content type, the vast majority of existing users have no audiobook listening history. This domain-level cold-start scenario differs from traditional item or user cold-start scenarios, since personalization must begin before any behavioral data exists in

Machine learning

Joint training on speech signal isolation and speech recognition improves performance

Kenichi Kumatani

April 1, 2019

The idea of using arrays of microphones to improve automatic speech recognition (ASR) is decades old. The acoustic signal generated by a sound source reaches multiple microphones with different time delays. This information can be used to create virtual directivity, emphasizing a sound arriving from a direction of interest and diminishing signals coming from other directions. In voice recognition, one of the more popular methods for doing this is known as “beamforming”.
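
As a rough illustration of the time-delay idea described above, here is a minimal delay-and-sum sketch in plain Python. It is not the joint-training method the post goes on to discuss; it assumes the per-microphone delays are known and are whole numbers of samples, and the function names (`simulate_mics`, `delay_and_sum`, `energy`) are hypothetical.

```python
import math


def simulate_mics(source, delays):
    """Hypothetical helper: each microphone hears the source shifted by its delay (in samples)."""
    length = len(source)
    return [
        [source[t - d] if t - d >= 0 else 0.0 for t in range(length)]
        for d in delays
    ]


def delay_and_sum(channels, steering_delays):
    """Undo the assumed delay on each channel, then average the aligned samples."""
    usable = min(len(ch) - d for ch, d in zip(channels, steering_delays))
    return [
        sum(ch[t + d] for ch, d in zip(channels, steering_delays)) / len(channels)
        for t in range(usable)
    ]


def energy(signal):
    """Sum of squared samples, used here to compare beamformer outputs."""
    return sum(x * x for x in signal)


# A sine wave with a period of 8 samples reaches three microphones
# with delays of 0, 2, and 4 samples (all values chosen for illustration).
source = [math.sin(2 * math.pi * t / 8) for t in range(64)]
true_delays = [0, 2, 4]
mics = simulate_mics(source, true_delays)

# Steering toward the true direction aligns the channels, so they add coherently.
on_target = delay_and_sum(mics, true_delays)
# Steering elsewhere (here: assuming no delays) leaves the channels misaligned.
off_target = delay_and_sum(mics, [0, 0, 0])

print(energy(on_target) > energy(off_target))
```

When the steering delays match the true delays, the averaged output reproduces the source; with mismatched delays the channels partly cancel, which is the "virtual directivity" effect in miniature.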

Conversational AI
Adversarial training produces synthetic data for machine learning

Rahul Gupta

March 21, 2019

Sentiment analysis is the attempt, computationally, to determine from someone’s words how he or she feels about something. It has a host of applications, in market research, media analysis, customer service, and product recommendation, among other things. Sentiment classifiers are typically machine learning systems, and any given application of sentiment analysis may suffer from a lack of annotated data for training purposes.

Conversational AI
Machine-labeled data + artificial noise = better speech recognition

Minhua Wu

March 20, 2019

Although deep neural networks have enabled accurate large-vocabulary speech recognition, training them requires thousands of hours of transcribed data, which is time-consuming and expensive to collect. So Amazon scientists have been investigating techniques that will let Alexa learn with minimal human involvement, techniques that fall in the categories of unsupervised and semi-supervised learning.

Conversational AI
To correct imbalances in training data, don’t oversample: Cluster

Ming Sun

March 11, 2019

In experiments involving sound recognition, technique reduces error rate by 15% to 30%.

Machine learning
Innovations from the 2018 Alexa Prize

Behnam Hedayatnia

March 5, 2019

The 2018 Alexa Prize featured eight student teams from four countries, each of which adopted distinctive approaches to some of the central technical questions in conversational AI. We survey those approaches in a paper we released late last year, and the teams themselves go into even greater detail in the papers they submitted to the latest Alexa Prize Proceedings. Here, we touch on just a few of the teams’ innovations.

Conversational AI
AI tools let Alexa Prize participants focus on science

Anushree Venkatesh

February 27, 2019

To ensure that Alexa Prize contestants can concentrate on dialogue systems — the core technology of socialbots — Amazon scientists and engineers built a set of machine learning modules that handle fundamental conversational tasks and a development environment that lets contestants easily mix and match existing modules with those of their own design.
