Bioethics Forum Essay
Making Big Data Inclusive
Big Data, which is derived from a multitude of sources, including social media, “wearables,” electronic health records, and health insurance claims, is increasingly being used in health care, and it can potentially improve the way medical professionals diagnose and treat illnesses.
But what happens when Big Data captures only a snapshot rather than a picture of the population as a whole? The sources that generate Big Data – the Internet and credit card use, electronic health records, health insurance claims – are not utilized by everyone. Certain demographics may be missing from or underrepresented in Big Data because they do not own smartphones, do not have access to the Internet, or, lacking health insurance, do not visit doctors on a regular basis. These sectors of the population disproportionately include low-income individuals, minority groups such as blacks and Hispanics, and the elderly.
It is important to ensure that all members of society are able to contribute to and create their own Big Data trail, if they so desire. I offer a few suggestions that could remedy some of the underlying inequities.
For example, federal programs such as Lifeline, which provides phone service to low-income individuals, could be expanded or used as a model for similar programs to give more of the population Internet access. In addition, the Food and Drug Administration could use its enforcement discretion to insist that mobile apps comply with diversification efforts, such as community outreach and the re-examination of trial designs. Further, the FDA could follow in the footsteps of the National Institutes of Health by requiring that the clinical trial data used to establish the safety and efficacy of drugs and devices be diverse.
Another possibility is the implementation of a regulatory scheme to help ensure that data used in health care is inclusive. The Department of Health and Human Services, for example, could propose rules on this topic, such as rules specifying how to recruit diverse populations. These proposals could eventually be codified in the Code of Federal Regulations.
While these suggestions may not be a magic bullet for making Big Data all-inclusive, they could provide a foundation to start a conversation on how to address the problem and gather more comprehensive, inclusive information on the population as a whole.
Sarah Elizabeth Malanga, JD, MPH, is a regulatory science fellow at the University of Arizona James E. Rogers College of Law. This post is based on a recent presentation given at the Petrie-Flom Center’s 2016 Annual Conference, “Big Data, Health Law, and Bioethics,” at Harvard Law School.