I have a prediction to make: within 5 years the unauthorized sale or other use of deidentified health data will be illegal. There is legislation in Oregon and Maryland that does just that. Expect these to be the first of many bills to come.

The greatest irony of the status quo is that although you may not have access to your health records, there are a handful of other people who probably do. Today a vast and murky market exists for the trafficking of your health data.

If health data is deidentified, that is, if there is no reasonable basis to believe it can be used to identify an individual, then that data is not considered protected health information (“PHI”) under the Health Insurance Portability and Accountability Act (“HIPAA”), which is healthcare’s top privacy law. As such, deidentified health data is not subject to the privacy and security rules of HIPAA and can be shared much more freely.

That is how data brokers are able to receive your deidentified health data without you knowing about it. For the most part this steady stream of data comes from health systems, payers, and pharmacies, but occasionally EHR vendors participate as well. Data brokers then aggregate this deidentified health information and sell it to third-party buyers. For example, Adam Tanner of the Harvard Institute for Quantitative Social Science estimates that a large pharmaceutical company might pay between $10 million and $40 million per year for data, consulting and services from Iqvia alone.

To get a sense of the scale of this market consider the following quote, which is lifted directly from Iqvia’s annual 10-K:

We have one of the largest and most comprehensive collections of healthcare information in the world, which includes more than 600 million comprehensive, longitudinal, non-identified patient records spanning sales, prescription and promotional data, medical claims, electronic medical records, genomics, and social media. Our scaled and growing data set contains over 30 petabytes of proprietary data sourced from more than 140,000 data suppliers and covering approximately one million data feeds globally.

And that’s just Iqvia! 

Your health data is being used without you knowing it, and it doesn’t have to pass through a data broker for that to happen. For example, in the UK Google’s DeepMind received access to 1.6 million citizens’ fully identified medical records through a deal with the Royal Free NHS. Citizens were not asked for their consent, nor were they informed that their data was being shared with a third-party commercial entity. And now DeepMind Health will be cannibalized into Google, and while the data they used won’t leave the UK, the algorithms they built using that data will be deployed around the world.

Even venerable institutions are coming under scrutiny for their dealings with patient data, like Sloan Kettering’s sweetheart deal with clear conflicts of interest. Or consider 23andMe’s $300,000,000 deal with drug maker GSK, which gave GSK exclusive rights to 23andMe’s trove of users’ genomic data for 4 years. You can point to countless other examples.

The beginning of the end

We’re at the end of this era. In the past year headline after headline of privacy scandals dominated the news, and high profile public events like the investigation into Russia’s information warfare through social media and Mark Zuckerberg’s testimony to Congress captivated the nation.

We’ve seen that the attitudes of consumers, legislators, and industry leaders towards data ownership and privacy have crossed a critical threshold. Tim Cook, CEO of Apple, has taken up the gauntlet to advocate on behalf of consumers’ privacy, arguing to world leaders and online that we need to end the “shadow economy that’s largely unchecked” of data brokers exploiting our data. Mark Zuckerberg himself penned a long post noting this shift towards privacy and stating that Facebook wants to become a “privacy-focused communications platform.”

In other halls of power prominent politicians like California Governor Gavin Newsom are arguing that “California’s consumers should also be able to share in the wealth that is created from their data… because we recognize that data has value and it belongs to you.” 2020 Presidential candidate Elizabeth Warren has advocated for breaking up big tech and explicitly cited a desire to increase pressure on companies to compete and protect users’ privacy. Senator John Kennedy recently introduced a bill empowering users to claim their social media data as property. Vermont passed a bill that required data brokers to register with the state government.

Most of the national discourse thus far has focused on other kinds of data: our search history, shopping habits, what we wittingly or unwittingly disclose to home assistants, messages with friends, that sort of stuff. I suspect that this is because we interact with those systems more often and more readily feel their effects. Everyone has a story about a suspiciously well-placed ad popping up that provides a salient reminder that we live under ubiquitous corporate surveillance.

When there is a similarly salient story in healthcare, people are similarly outraged. After 23andMe’s deal with GSK, angry consumers looked to delete their genomes or opt out of the deal. Popular fitness and health apps “scrambled” to stop sharing data with Facebook after the Wall Street Journal reported they were sending sensitive data, including their users’ weight and menstrual cycles. It took only a day for the Privacy Commissioner of Ontario to launch an investigation into potential abuses of privacy in the wake of a story on the sale of patient data.

In the wake of these stories Stanford and major healthcare industry partners are collaborating on much needed guidelines for the ethical use of digital health technology. This is number 5:

Patients should be able to decide whether their information is shared, and to know how a digital health company uses information to generate revenues.

The idea that patients should be able to decide whether their information is shared isn’t anything new, but it reveals something about our time that knowledge of a company’s data business model is included in guidelines for ethical use of digital health technology.

To see where we’re going it is helpful to look at the past. People intuitively feel that there was something wrong with how Henrietta Lacks was treated and how others profited from her cells without her knowing they took samples. Lacks’ family lived in destitution and struggled with access to healthcare, and yet Lacks’ cells proved to be extremely valuable. Today there are over 10,000 patents involving Henrietta Lacks’ cells!

When we hear stories of our health data being monetized without our consent or our knowledge, they tap into the same feeling that something is morally wrong: that our bodies and our health data are not commodities to be mined, but an inseparable part of ourselves as individuals with rights that must be respected. Don’t mistake this for wanting to shy away from our obligations to each other or to science. On the contrary, we overwhelmingly want to participate in research, but whether that is in the form of a clinical trial or feeding our health record to an AI, we want our contributions to be recognized and valued. I suspect that if we were to frame those contributions in such a way we would see even more learning from health data.

Despite vast markets and near universal recognition of the importance of health data, the status quo gives patients little control of, recognition for, or value from their health data. There is ample contemporary evidence that our social understanding has already shifted beyond this, and compelling historical precedent to suggest we’ll look back and wonder how we let this system persist.

It is only a matter of time before businesses and regulatory bodies catch up.