Measurement re-calibration the right and fair way
Measurements of a physical quantity by measuring devices are usually noisy enough that we need to correct, or at least mitigate, the effects of noise. For this purpose, it is important to distinguish between systematic and random noise: they are of a different nature and (when defined properly) independent of each other, so they should be dealt with differently. For example, random noise can be significantly mitigated by averaging repeated measurements when this is possible, optimally under different controlled conditions. In contrast, systematic noise (or bias) can be corrected using regression with some ground truth (or an accurate proxy of ground truth) as the independent variable. However, regression is often used in a predictive way, with ground truth as the response variable, which has the effect of minimizing total error (random plus systematic). Although this might sound reasonable, it turns out that besides blurring the distinction between random and systematic noise, it achieves nothing for the former and does not completely correct the latter. Moreover, the residual bias might be small for the majority of the population, but it usually grows larger further away from the mean. In many cases this leads to unfair outcomes, in particular when the measurements are of human biological quantities. This is especially important when the biological quantities are used to determine whether, and how severely, a subject is sick or unhealthy, usually by how much their measurements deviate from the mean. We argue here that re-calibration done with this in mind should be restricted to eliminating systematic noise, that random noise should be mitigated by other means (like improved device engineering), and that key performance indicators (KPIs) based on total errors should be revised.
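The contrast between the two uses of regression can be illustrated with a small simulation. The sketch below is not taken from the paper: the linear device model, the population parameters, the threshold of 130 for "far from the mean", and the function names `ols_fit` and `compare_recalibrations` are all illustrative assumptions. It fits one regression with ground truth as the independent variable (and inverts it), and one with ground truth as the response variable, then reports the total error and the bias in the upper tail for each.

```python
import random
from statistics import fmean


def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def compare_recalibrations(n=50_000, seed=0):
    """Compare calibration and predictive regression on simulated data."""
    rng = random.Random(seed)

    # Hypothetical population and device (all numbers are assumptions):
    # truth ~ N(100, 15), device reads 5 + 0.9 * truth plus random noise.
    truth = [rng.gauss(100.0, 15.0) for _ in range(n)]
    measured = [5.0 + 0.9 * t + rng.gauss(0.0, 10.0) for t in truth]

    # Calibration: ground truth is the independent variable; invert the fit.
    a, b = ols_fit(truth, measured)
    calibrated = [(m - a) / b for m in measured]

    # Predictive: ground truth is the response variable.
    c, d = ols_fit(measured, truth)
    predicted = [c + d * m for m in measured]

    def rmse(estimates):
        return fmean((e - t) ** 2 for e, t in zip(estimates, truth)) ** 0.5

    def tail_bias(estimates):
        # Mean error among subjects whose true value is far above the mean.
        return fmean(e - t for e, t in zip(estimates, truth) if t > 130.0)

    return {
        "calibration_rmse": rmse(calibrated),
        "predictive_rmse": rmse(predicted),
        "calibration_tail_bias": tail_bias(calibrated),
        "predictive_tail_bias": tail_bias(predicted),
    }


if __name__ == "__main__":
    for name, value in compare_recalibrations().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the predictive fit has the smaller total error, yet it pulls subjects with high true values back toward the population mean, while the inverted calibration fit leaves their average error close to zero. A KPI based on total error alone would therefore favor the approach that is biased for exactly the subjects furthest from the mean.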