CausalFusion: Integrating LLMs and graph falsification for causal discovery
2026
Causal discovery is central to enabling causal models for tasks such as effect estimation, counterfactual reasoning, and root cause attribution. Yet existing approaches face trade-offs: purely statistical methods (e.g., PC, LiNGAM) often return structures that overlook domain knowledge, while expert-designed DAGs are difficult to scale and time-consuming to construct. We propose CausalFusion, a hybrid framework that combines graph falsification tests with large language models (LLMs) acting as domain-specialized data scientists. LLMs incorporate domain expertise into candidate structures, while graph falsification tests iteratively refine DAGs to balance statistical validity with expert plausibility. We evaluate CausalFusion through two experiments: (i) a synthetic e-commerce dataset with a precisely defined ground truth DAG, and (ii) real-world supply chain data from Amazon, where the ground truth was constructed with domain experts. To benchmark performance, we compare against classical causal discovery algorithms (PC, LiNGAM) as well as LLM-only baselines that generate DAGs without iterative falsification. Structural Hamming Distance (SHD) is used as the primary evaluation metric to quantify similarity between generated and “true” DAGs. We also analyze the chain-of-thought traces of different foundation models to examine whether deeper reasoning correlates with improved structural accuracy or reproducibility. Results show that CausalFusion produces DAGs more closely aligned with ground truth than both classical algorithms and LLM-only baselines, while offering interpretable reasoning at each iteration, though challenges in reproducibility and generalizability remain.
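The SHD metric mentioned above counts the minimum number of edge additions, deletions, and reversals needed to turn one DAG into another. A minimal sketch of one common convention (reversals counted as a single error), assuming DAGs are given as binary adjacency matrices where `A[i, j] = 1` denotes an edge i → j; the function name and the toy graphs are illustrative, not from the paper:

```python
import numpy as np

def structural_hamming_distance(true_adj, pred_adj):
    """SHD between two DAGs given as 0/1 adjacency matrices.

    Counts each missing edge, extra edge, and reversed edge as one
    error; a reversal is counted once, not as a deletion plus an
    addition.
    """
    true_adj = np.asarray(true_adj)
    pred_adj = np.asarray(pred_adj)
    diff = np.abs(true_adj - pred_adj)
    # A reversed edge appears as a mismatch at both (i, j) and (j, i).
    # Symmetrizing the mismatch matrix and clipping to 1 collapses each
    # reversal (as well as each addition/deletion) to a single entry
    # per unordered node pair; summing the upper triangle counts pairs.
    mismatch = np.clip(diff + diff.T, 0, 1)
    return int(np.triu(mismatch).sum())

# Toy 3-node example: true DAG is X -> Y -> Z; the predicted DAG has
# X -> Y correct but the second edge reversed (Z -> Y instead of Y -> Z).
true_dag = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]
pred_dag = [[0, 1, 0],
            [0, 0, 0],
            [0, 1, 0]]
print(structural_hamming_distance(true_dag, pred_dag))  # -> 1 (one reversal)
```

Lower SHD means closer structural agreement with the ground-truth DAG; note that some SHD variants instead count a reversal as two errors, so the convention should be fixed when comparing methods.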