Hastings Center Report
Synthetic Health Data: Real Ethical Promise and Peril
Abstract: Researchers and practitioners are increasingly using machine-generated synthetic data as a tool for advancing health science and practice, by expanding access to health data while—potentially—mitigating privacy and related ethical concerns around data sharing. While using synthetic data in this way holds promise, we argue that it also raises significant ethical, legal, and policy concerns, including persistent privacy and security problems, accuracy and reliability issues, worries about fairness and bias, and new regulatory challenges. The virtue of synthetic data is often understood to be its detachment from the data subjects whose measurement data is used to generate it. However, we argue that addressing the ethical issues synthetic data raises might require bringing data subjects back into the picture, finding ways that researchers and data subjects can be more meaningfully engaged in the construction and evaluation of datasets and in the creation of institutional safeguards that promote responsible use.