Correlation Constraints for Regression Models: Controlling Bias in Brain Age Prediction


Matthias S. Treder, Jonathan P. Shock, Dan J. Stein, Stéfan du Plessis5, Soraya Seedat and Kamen A. Tsvetanov


18 Feb 2021


In neuroimaging, the difference between chronological age and predicted brain age, also known as brain age delta, has been proposed as a pathology marker linked to a range of phenotypes. Brain age delta is estimated using regression, which involves a frequently observed bias due to a negative correlation between chronological age and brain age delta. In brain age prediction models, this correlation can manifest as an overprediction of the age of young brains and an underprediction for elderly ones. We show that this bias can be controlled for by adding correlation constraints to the model training procedure. We develop an analytical solution to this constrained optimization problem for Linear, Ridge, and Kernel Ridge regression. The solution is optimal in the least-squares sense i.e., there is no other model that satisfies the correlation constraints and has a better fit. Analyses on the PAC2019 competition data demonstrate that this approach produces optimal unbiased predictive models with a number of advantages over existing approaches. Finally, we introduce regression toolboxes for Python and MATLAB that implement our algorithm.