- 2024: Posterior sampling in contextual bandits with a Gaussian prior can be implemented exactly or approximately using the Laplace approximation. The Gaussian prior is computationally efficient, but it cannot describe complex distributions. In this work, we propose approximate posterior sampling algorithms for contextual bandits with a diffusion model prior. The key idea is to sample from a chain of approximate…
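For the Gaussian-prior baseline this abstract contrasts against, posterior sampling can be exact: in a linear contextual bandit with Gaussian reward noise, the posterior over each arm's weights stays Gaussian by conjugacy, so no Laplace approximation is needed. The sketch below illustrates this with Thompson sampling; the dimensions, noise level, and linear reward model are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, sigma2 = 3, 2, 0.25  # context dim, arms, noise variance: illustrative choices

# Per-arm Gaussian prior N(0, I). With Gaussian noise the posterior stays
# Gaussian (conjugacy), so sampling from it is exact.
precision = [np.eye(d) for _ in range(n_arms)]  # posterior precision per arm
b = [np.zeros(d) for _ in range(n_arms)]        # accumulated x * r / sigma2 per arm
theta_true = rng.normal(size=(n_arms, d))       # hypothetical ground-truth weights

for t in range(500):
    x = rng.normal(size=d)  # observed context
    # Thompson sampling: draw one posterior sample per arm, act greedily on it.
    draws = [rng.multivariate_normal(np.linalg.solve(P, bi), np.linalg.inv(P))
             for P, bi in zip(precision, b)]
    a = int(np.argmax([x @ th for th in draws]))
    r = x @ theta_true[a] + np.sqrt(sigma2) * rng.normal()  # noisy reward
    precision[a] += np.outer(x, x) / sigma2  # exact conjugate posterior update
    b[a] += x * r / sigma2
```

A diffusion model prior, by contrast, has no such closed-form posterior, which is what motivates the approximate sampling chain the abstract describes.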
- 2024: In the domain of code generation, self-debugging is crucial. It allows LLMs to refine their generated code based on execution feedback, which is particularly important because generating correct solutions in one attempt proves challenging for complex tasks. Prior work on self-debugging mostly focuses on prompting methods that provide LLMs with few-shot examples, which work poorly on small open-source LLMs…
- RecSys 2024 Workshop on Context-Aware Recommender Systems: Sequential recommendation systems often struggle to make predictions or take action when dealing with cold-start items that have a limited number of interactions. In this work, we propose SimRec, a new approach to mitigating the cold-start problem in sequential recommendation systems. SimRec addresses this challenge by leveraging the inherent similarity among items, incorporating item similarities into the…
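The snippet above is truncated before it explains how SimRec actually uses item similarities, so the following is only a generic sketch of the underlying idea: a cold item with no interaction history can still be scored by its similarity (here, cosine over content embeddings) to items the user has already interacted with. The function name, embedding source, and max-pooling choice are all hypothetical.

```python
import numpy as np

def similarity_scores(item_emb, user_history, candidate_ids):
    """Score candidates by cosine similarity to the user's interacted items.

    A cold-start candidate with no interactions of its own can still be
    scored, because the score depends only on its embedding. (Hypothetical
    sketch, not the SimRec architecture.)
    """
    hist = item_emb[user_history]    # (h, d) embeddings of interacted items
    cand = item_emb[candidate_ids]   # (c, d) embeddings of candidates
    hist = hist / np.linalg.norm(hist, axis=1, keepdims=True)
    cand = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    return (cand @ hist.T).max(axis=1)  # best-match similarity per candidate

# Toy usage: item 2 resembles the interacted item 0, item 1 does not.
emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
scores = similarity_scores(emb, user_history=[0], candidate_ids=[2, 1])
```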
- MLTEC 2024: The increasing popularity of wireless sensing applications has led to a growing demand for large datasets of realistic wireless data. However, collecting such wireless data is often time-consuming and expensive. To address this challenge, we propose a synthetic data generation pipeline that uses human meshes generated from videos and can generate data at scale. The pipeline first generates a 3D mesh of the human…
- 2024: Fine-tuning large language models (LLMs) has achieved remarkable performance across various natural language processing tasks, yet it demands ever more memory as model sizes keep growing. To address this issue, the recently proposed memory-efficient zeroth-order (MeZO) methods attempt to fine-tune LLMs using only forward passes, thereby avoiding the need for a backpropagation graph. However, significant…
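The forward-pass-only idea behind zeroth-order methods like MeZO can be shown in a few lines: perturb the parameters with a shared random direction, take two forward passes, and use the loss difference as a one-sample gradient estimate, so no backpropagation graph is ever built. The toy quadratic loss, step size, and loop below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def spsa_grad(loss, theta, eps=1e-3, seed=0):
    """Simultaneous-perturbation gradient estimate from two forward passes.

    (L(theta + eps*z) - L(theta - eps*z)) / (2*eps) approximates the
    directional derivative along z; scaling z by it gives an unbiased
    single-sample gradient estimate. MeZO additionally regenerates z from
    a saved seed so the perturbation never needs to be stored.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=theta.shape)
    g_scale = (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps)
    return g_scale * z

# Toy quadratic stand-in for an LLM objective (hypothetical).
loss = lambda w: float(np.sum((w - 1.0) ** 2))
w = np.zeros(4)
for step in range(2000):
    w -= 0.05 * spsa_grad(loss, w, seed=step)  # forward passes only
```

The price of avoiding backpropagation is a noisy, one-dimensional view of the gradient per step, which is the slow-convergence issue the abstract goes on to discuss.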
Related content
- July 13, 2022: Allowing separate tasks to converge on their own schedules and using knowledge distillation to maintain performance improves accuracy.
- July 12, 2022: Fun visual essays explain key concepts of machine learning.
- July 07, 2022: Walid’s 2010 paper on distributed caching algorithms for content distribution networks cited for its “significant impact on the research community”.
- July 06, 2022: Expanded program aimed at engineering undergraduate and graduate students builds on the success of the inaugural program.
- June 28, 2022: Amazon’s TabTransformer model is now available through SageMaker JumpStart and the official release of the Keras open-source library.
- June 24, 2022: Technique that mixes public and private training data can meet differential-privacy criteria while cutting error increase by 60%-70%.