Big Data in Health Care: Building an Ethics Framework for Biomedical Data Modeling

Principal investigator: Diane M. Korngiebel
Funder: National Institutes of Health/National Human Genome Research Institute; National Institutes of Health/Office of the Director

Big Data in health care is growing, and it comes from an increasing number of sources, including electronic health records, patient monitors, physical activity trackers, and smartphone applications. These data are used to build health-related models that estimate people’s risk of disease and could influence treatment decisions. But how accurate are the data? How biased? How can errors affect patients? Could a model make inaccurate predictions for nonwhite or rural populations because it was trained on an incomplete data set that did not equitably represent groups defined by age, socioeconomic status, race, or ethnicity? Without representative data, could it miss key health indicators, such as who is most at risk of developing type 2 diabetes or who is at higher risk of readmission after discharge from the hospital?

Health data models have the potential to exacerbate existing disparities in health and health care and to create new ones. However, there is currently no framework that biomedical data scientists can use to ensure they document key decisions and consider the potential ethical and societal consequences of those decisions during model development rather than after the fact. This project will develop an ethics framework that scientists can use to anticipate and prevent these problems.