HKS Authors

See citation below for complete author information.

Daniel Paul Professor of the Practice of Government and Technology, HKS and FAS


Forty-eight states in the United States collect statewide inpatient discharge data that include personal health information about each patient’s hospital visit [1]. A 2013 survey found that 33 of those states subsequently sold or otherwise disclosed copies of the data, but only three states de-identified the data in a manner consistent with the standards established under the Health Insurance Portability and Accountability Act (HIPAA), the U.S. federal regulation that dictates the rules by which personal health information is shared [2]. While states are not mandated to follow HIPAA when de-identifying their data, many states still use a version of HIPAA to de-identify discharge data. Did the other 30 states put the privacy of personal health data at risk? To answer this question, Latanya Sweeney tested whether Washington State’s hospital data was vulnerable to re-identification. The study showed that Washington State’s inpatient data allowed 35 of 81 individuals (43 percent) identified in local newspaper stories to be correctly matched to anonymized hospital visits in the discharge data released by the state [3]. After the study, Washington State improved its anonymization standard for publicly available data and added an application process for others to receive more detailed nonpublic discharge data. Despite this successful outcome, many states were not convinced that the same re-identification strategy would succeed on their datasets. One reason was a belief that Washington State was more vulnerable because it shared patient age in months, a practice not followed by many other states. Is this correct? Are other states exempt from this re-identification strategy? To find out, we repeated the approach on statewide health data from 2010 in Maine and 2011 in Vermont using a total of 291 local news stories.
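The matching idea described above, linking details reported in news stories to anonymized hospital visits, can be illustrated with a minimal sketch. This is not the study's actual procedure: the field names (`hospital`, `admit_month`, `age`, `sex`), the function `unique_matches`, and the toy records are all hypothetical, chosen only to show how a story counts as a match when exactly one discharge record shares its attributes.

```python
from collections import defaultdict

# Hypothetical quasi-identifier fields; the real datasets' columns are not given here.
KEYS = ("hospital", "admit_month", "age", "sex")


def unique_matches(news_stories, discharge_records, keys=KEYS):
    """Return {story_id: record_id} for stories that match exactly one record."""
    # Index discharge records by their combination of quasi-identifier values.
    index = defaultdict(list)
    for rec in discharge_records:
        index[tuple(rec[k] for k in keys)].append(rec["record_id"])

    matches = {}
    for story in news_stories:
        candidates = index.get(tuple(story[k] for k in keys), [])
        # Only an unambiguous (single-candidate) match counts as a re-identification.
        if len(candidates) == 1:
            matches[story["story_id"]] = candidates[0]
    return matches


# Made-up toy data for illustration only.
records = [
    {"record_id": "r1", "hospital": "A", "admit_month": "2010-03", "age": 61, "sex": "M"},
    {"record_id": "r2", "hospital": "A", "admit_month": "2010-03", "age": 34, "sex": "F"},
    {"record_id": "r3", "hospital": "A", "admit_month": "2010-03", "age": 34, "sex": "F"},
]
stories = [
    {"story_id": "s1", "hospital": "A", "admit_month": "2010-03", "age": 61, "sex": "M"},
    {"story_id": "s2", "hospital": "A", "admit_month": "2010-03", "age": 34, "sex": "F"},
    {"story_id": "s3", "hospital": "B", "admit_month": "2010-05", "age": 50, "sex": "M"},
]

matches = unique_matches(stories, records)
match_rate = len(matches) / len(stories)
```

In this toy example only the first story is matched: the second story fits two records and is therefore ambiguous, and the third fits none. Coarser attributes (for example, age in years rather than months) tend to make more stories ambiguous, which is the intuition behind the states' objection that the abstract sets out to test.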


Yoo, Ji Su, Alexandra Thaler, Latanya Sweeney, and Jinyan Zang. "Risks to Patient Privacy: A Re-identification of Patients in Maine and Vermont Statewide Hospital Data." Technology Science (October 2018).