ICML: Where causality meets machine learning

Amazon’s Dominik Janzing on the history and promise of the young field of causal machine learning.

July 21, 2022

Dominik Janzing, a principal research scientist with Amazon Web Services, is a coauthor on four of Amazon’s 18 papers at this year’s International Conference on Machine Learning (ICML), and all four of those papers, like most of Janzing’s papers, have the word “causal” in the title.

Our 2012 paper ‘On causal and anticausal learning’ just received a Test of Time Honorable Mention at @icmlconf #ICML2022: https://t.co/gc1FZYSOyP. I am really grateful, and would like to use this occasion for some thoughts on causality and machine learning:
— Bernhard Schölkopf (@bschoelkopf) July 20, 2022

At ICML 2022, “On causal and anticausal learning”, a 2012 ICML paper that Janzing wrote together with Amazon VP and distinguished scientist Bernhard Schölkopf and colleagues, received an honorable mention for the conference’s Test of Time award.

“It's still a small fraction of papers that refer to causality,” Janzing says, “but it is increasing. If you look at the long-term trend, it's clearly increasing, and I strongly believe that this trend will continue for a while. My prediction is causality will play an even bigger role than now.”

The burgeoning interest in causality among machine learning researchers grew out of related work in neighboring fields, Janzing explains.

“If one looks at the traditional questions of causality, these were about the causal effect of a certain intervention,” Janzing says. “For instance, there’s a patient; the patient gets a drug or not. What's the influence on the recovery, given that there are further influencing factors, called covariates?” That’s the sense of causality that has been central to experimental design and economics.
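The drug-and-recovery question can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from Janzing's work: the single covariate (illness severity), all probabilities, and the function names are assumptions chosen for the example. It compares a naive difference in recovery rates with an estimate that adjusts for the covariate by stratifying on it.

```python
import random


def simulate(n=20000, seed=0):
    """Generate hypothetical (severe, treated, recovered) records."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.5  # covariate: illness severity
        # Assumption: severe patients are more likely to get the drug.
        treated = rng.random() < (0.8 if severe else 0.2)
        # Assumption: the drug raises recovery probability by 0.2.
        p_recover = (0.3 if severe else 0.7) + (0.2 if treated else 0.0)
        recovered = rng.random() < p_recover
        rows.append((severe, treated, recovered))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_effect(rows):
    """Recovery rate of treated minus untreated, ignoring the covariate."""
    treated = mean(r for _, t, r in rows if t)
    control = mean(r for _, t, r in rows if not t)
    return treated - control


def adjusted_effect(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    effect = 0.0
    for level in (False, True):
        stratum = [row for row in rows if row[0] == level]
        treated = mean(r for _, t, r in stratum if t)
        control = mean(r for _, t, r in stratum if not t)
        effect += (len(stratum) / len(rows)) * (treated - control)
    return effect


if __name__ == "__main__":
    data = simulate()
    print("naive estimate:   ", round(naive_effect(data), 3))
    print("adjusted estimate:", round(adjusted_effect(data), 3))
```

Under these assumed numbers, the naive comparison understates the drug's benefit because the sicker patients are the ones who mostly receive it, while the stratified estimate lands near the simulated effect of 0.2. Real studies typically involve many covariates and need more careful methods than simple stratification.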

“Then there was a different community, the graphical-models community, that already modeled more complex systems,” Janzing continues. “The graphical model on a large number of variables can be used to compute the average effect of one specific variable on another one. But it also has the more general goal of decomposing complex systems into understandable mechanisms. I looked at, for instance, problems of causal discovery — how to infer the graphical model from passive observations. That’s still a very ambitious goal. I am optimistic that for this problem also, progress will come from stronger connections to machine learning.”

Amazon principal research scientist Dominik Janzing. “Once you make friends with the scary monster causality,” Janzing says, “it becomes very helpful.”

Enter machine learning

Sometime around 2010, Janzing says, “it became more apparent that causality matters for a lot of different machine learning problems, because it can make a difference whether one just wants to infer statistical relations or whether one wants to infer the generating process.”
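A toy example can show how a statistical relation and a generating process come apart. The sketch below is an assumption-laden illustration, not a method from the article: a hypothetical cause `x` produces an effect `y` through a made-up linear mechanism. Observing a high `y` is informative about `x`, but setting `y` by intervention is not, because the intervention bypasses the mechanism that linked them.

```python
import random


def sample(rng, do_y=None):
    """Draw one (x, y) pair from a hypothetical process in which x causes y."""
    x = rng.gauss(0, 1)              # cause
    y = 2 * x + rng.gauss(0, 1)      # effect, generated from x plus noise
    if do_y is not None:
        y = do_y                     # intervention: override y's mechanism
    return x, y


def compare(n=50000, seed=0):
    """Return the mean of x when y is observed near 2 and when y is set to 2."""
    rng = random.Random(seed)

    # Passive observation: keep only the samples where y happened to be near 2.
    observed = [sample(rng) for _ in range(n)]
    near = [x for x, y in observed if abs(y - 2.0) < 0.5]
    mean_x_given_y = sum(near) / len(near)

    # Intervention: force y to 2 in every sample.
    intervened = [sample(rng, do_y=2.0) for _ in range(n)]
    mean_x_do_y = sum(x for x, _ in intervened) / n

    return mean_x_given_y, mean_x_do_y


if __name__ == "__main__":
    seen, forced = compare()
    print("mean of x when y is observed near 2:", round(seen, 3))
    print("mean of x when y is set to 2:       ", round(forced, 3))
```

In this sketch the first average is clearly positive and the second is close to zero. A model that captures only the statistical relation between `x` and `y` cannot tell these two situations apart.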

There are several hot topics in machine learning whose relations to causality are currently being explored, Janzing says. These include explainable AI, fairness, and learning data representations that are robust to distribution shifts.

“Does explainable AI entail causal explanations by definition?” Janzing asks. “Are semantically meaningful representations necessarily causal representations? If yes, in what sense?”

Definitions

The problem of understanding what causality means is not just philosophical, Janzing explains. It also has immediate consequences for research.

“Whenever we work on applications, we clearly see there is a concept that needs to be defined,” he says. “My students are sometimes surprised that these concepts don't exist yet, because it sounds so obvious that they should exist. But they don't. Which shows that the field is young.”

For instance, one of Janzing’s papers at ICML, “Causal structure-based root cause analysis of outliers”, presents a method for quantifying the extent to which different root causes contribute to an outcome. But first it presents a formal definition of the root cause of an extreme event — “which we didn’t find anywhere,” Janzing says.

Janzing and his colleagues' ICML paper "Causal structure-based root cause analysis of outliers" treats noise variables in a causal graph as a "switch" that can be thrown to select a particular causal mechanism.

In the same way that the field’s fundamental concepts still require further definitions — “Mostly on top of the graphical-model framework,” Janzing says — it remains to be seen which mathematical tools will prove most useful for causal analysis. Work on causal machine learning so far has involved statistics, functional analysis (especially kernel methods), linear algebra, Shannon information theory, algorithmic information theory, Fourier analysis, group theory, and game theory.

“If I look at the mathematical methods applied in causal inference, then I would say, nobody knows which mathematical methods will mainly be used in causality in 10 years,” Janzing says. “I don't see any math to be irrelevant for that. So it seems to me that the field is still so open and far from settling already to some specific topics, type of questions, and methods.”

About the Author

Larry Hardesty

Larry Hardesty is the editor of the Amazon Science blog. Previously, he was a senior editor at MIT Technology Review and the computer science writer at the MIT News Office.