Social Complexity and Fairness in Synthetic Medical Data
Medical research increasingly relies on big data and powerful computational methods. However, the same methods that make large data sets useful also make it easy to re-identify individuals within them, which is especially problematic for sensitive medical data. One solution is to use machine learning to generate synthetic data from the raw data: a fabricated data set that preserves important statistical properties of the original and can be used for research purposes instead.
While this works well in theory, early results indicate that machine-learning-generated data sets tend to over-represent majority groups and diminish the representation of minorities. Applied to medical data, this means synthetic data sets would likely over-represent 'standard' patients, i.e. white, middle-class, 35-year-old men, despite decades of regulation and research practice aimed at including other patients and bodies in medical research.
This project, centered on a two-year postdoc and conducted as a collaboration between WASP-HS and WASP researchers, will develop fairness metrics to evaluate the production of synthetic data from a specific medical data set, with the hypothesis that intersectionality can contribute to better data. Additionally, we will closely examine existing synthetic medical data to see what lessons social science can draw from it to inform theoretical work on intersectional power dynamics in society.
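The project description does not specify which fairness metrics will be developed. As one illustrative sketch of the general idea, the snippet below compares each subgroup's share of a synthetic data set to its share of the real data, so a ratio below 1 flags under-representation of the kind described above. The function name and the toy data are hypothetical, not part of the project.

```python
from collections import Counter

def representation_ratio(real_labels, synthetic_labels):
    """For each subgroup in the real data, return the ratio of its share
    of the synthetic data to its share of the real data.
    A ratio < 1 indicates the subgroup is under-represented synthetically."""
    real_counts = Counter(real_labels)
    synth_counts = Counter(synthetic_labels)
    n_real = len(real_labels)
    n_synth = len(synthetic_labels)
    return {
        group: (synth_counts.get(group, 0) / n_synth) / (count / n_real)
        for group, count in real_counts.items()
    }

# Toy example: a minority group shrinks from 20% of the real data
# to 10% of the synthetic data.
real = ["majority"] * 80 + ["minority"] * 20
synthetic = ["majority"] * 90 + ["minority"] * 10
ratios = representation_ratio(real, synthetic)
# ratios["minority"] is 0.5: the group's share was halved.
```

Real fairness metrics for synthetic data would go further, e.g. by considering intersections of attributes rather than single group labels, but the basic comparison of subgroup shares follows this pattern.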
Start: 1 January 2023
End: 31 December 2024
fairness metrics, ML, synthetic data, medical data, intersectionality
Universities and institutes
Linköping University
Chalmers University of Technology
Project members
Ericka Johnson
Professor
Linköping University
Francis Lee
Associate Professor
Chalmers University of Technology
Gabriel Eilertsen
Linköping University
Saghi Hajisharif
Linköping University