I argue that regularizing terms in standard regression methods not only help against overfitting finite data, but sometimes also yield better causal models in the infinite sample regime. I first consider a multi-dimensional variable linearly influencing a target variable with some multi-dimensional unobserved common cause, where the confounding effect can be decreased by keeping the penalizing term in Ridge and Lasso regression even in the population limit. Choosing the size of the penalizing term is, however, challenging, because cross-validation is pointless. Here it is done by first estimating the strength of confounding via a method proposed earlier, which yielded some reasonable results for simulated and real data. Further, I prove a 'causal generalization bound' which states (subject to a particular model of confounding) that the error made by interpreting any non-linear regression as a causal model can be bounded from above whenever the functions are taken from a not too rich class. In other words, the bound guarantees 'generalization' from observational to interventional distributions, which is usually not the subject of statistical learning theory (and is only possible due to the underlying symmetries of the confounder model).
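The first claim can be illustrated with a small sketch (my own toy illustration, not the paper's method, and with arbitrarily chosen parameters): for a linear model with a multi-dimensional hidden confounder Z, one can compute the population-limit Ridge regression vector directly from the population covariances and observe that some nonzero penalty brings it closer to the true causal coefficients than ordinary least squares.

```python
# Toy sketch: ridge regularization can reduce confounding bias even in the
# population (infinite-sample) limit. The structural model assumed here is
#   X = M Z + E,   Y = a' X + c' Z + noise,
# with Z (hidden confounder) and E standard normal. All parameter choices
# (dimensions, scaling of c) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, p = 5, 5                       # dim of observed X and hidden confounder Z
M = rng.normal(size=(d, p))       # mixing matrix: X = M Z + E
a = rng.normal(size=d)            # true causal effect of X on Y
c = 5.0 * rng.normal(size=p)      # strong effect of Z on Y => strong confounding

# Population second moments (no sampling, i.e. the infinite-sample regime):
Sxx = M @ M.T + np.eye(d)         # Cov(X)
Sxy = Sxx @ a + M @ c             # Cov(X, Y); the M @ c term is the confounding

def ridge_population(lam):
    """Population-limit ridge regression vector for penalty lam."""
    return np.linalg.solve(Sxx + lam * np.eye(d), Sxy)

def causal_error(lam):
    """Distance between the regression vector and the true causal effect a."""
    return np.linalg.norm(ridge_population(lam) - a)

ols_error = causal_error(0.0)     # OLS is ridge with lam = 0: biased by M @ c
best_ridge_error = min(causal_error(lam) for lam in np.logspace(-2, 2, 50))

print(f"OLS causal error:        {ols_error:.3f}")
print(f"best ridge causal error: {best_ridge_error:.3f}")
```

Note that the sketch cheats by scanning penalty values against the known ground truth a; the abstract's point is precisely that in practice the penalty must instead be chosen by estimating the confounding strength, since cross-validation optimizes predictive rather than causal error.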