Identifying risk for major illnesses such as cancers and heart disease is often predicated on variables such as age, gender, and lifestyle. Adding race and ethnicity as factors is hotly debated among medical professionals. While there are studies suggesting this data improves the accuracy of predictive models, there is also concern these markers increase the stigma faced by already marginalized populations.
In a new report, Harvard Kennedy School public policy professor Sharad Goel, associate public policy professor Soroush Saghafian, HKS PhD candidate Madison Coots, and David Kent of Tufts University provide an important framework in this ongoing debate. HKS asked the researchers about their work.
Q: How did this study come about?
Coots: During my first year as a PhD student, Sharad and I reviewed an interesting paper about assessing diabetes risk that argued for the use of race to better account for risk across different racial and ethnic groups. Our project started out trying to replicate that analysis, in part to see if it was possible to get an accurate predictive model for diabetes risk that didn’t include race but maybe included some other factors like social determinants of health, family history of diabetes, or different biomarkers. What we found is that it was really hard to do that.
That led us down this line of inquiry about when it makes sense to include race. How can you evaluate that decision? Does the boost in accuracy from including race actually make a big difference in clinical decision making? We continued pulling on different threads and eventually focused our study on cases of breast cancer, lung cancer, and cardiovascular disease.
Q: Why is this research important?
Saghafian: Physicians and scientists debate whether race is a useful variable for making better predictions, and we find that it is. But we also find there are arguments against including race, because doing so raises a lot of societal and ethical concerns.
The "aha!" moment for us is that in terms of prediction, it helps. But when it comes to how much benefit you get from it, it is not as large as people think.
You need the prediction to make decisions. But when it comes to how much benefit you get from decisions that are based on race, the paper shows that it's not that much.
Goel: This is a fraught issue and we're pointing out statistical errors on both sides of the debate. On one hand, some argue that "race" is a social construct so shouldn't be used to make predictions. We agree that race is a messy, social category, but we also find that it adds predictive power. On the other hand, people argue that all information is valuable, so you should consider race. But we find that in practice the marginal value of race is often pretty small. We're likely upsetting people in both camps.
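The gap Goel describes between predictive power and marginal value in practice can be shown with a toy calculation. The sketch below is not the researchers' analysis: the patients, the risk estimates, and the 15 percent screening threshold are all invented for illustration. It assumes a simple rule in which a patient is referred for screening when estimated risk meets a threshold, then counts how many recommendations change when a hypothetical race-aware estimate replaces a race-blind one.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Both fields are made-up numbers, not estimates from any real model.
    risk_without_race: float  # hypothetical race-blind risk estimate
    risk_with_race: float     # hypothetical race-aware risk estimate


def recommend_screening(risk: float, threshold: float) -> bool:
    """Refer a patient for screening when estimated risk meets the threshold."""
    return risk >= threshold


def changed_decisions(patients: list[Patient], threshold: float) -> int:
    """Count patients whose recommendation differs between the two estimates."""
    return sum(
        recommend_screening(p.risk_without_race, threshold)
        != recommend_screening(p.risk_with_race, threshold)
        for p in patients
    )


# Illustrative cohort: every estimate shifts when race is added,
# but only one patient moves across the decision threshold.
patients = [
    Patient(0.02, 0.03),
    Patient(0.04, 0.05),
    Patient(0.09, 0.11),
    Patient(0.14, 0.12),
    Patient(0.14, 0.16),
    Patient(0.22, 0.25),
    Patient(0.30, 0.27),
    Patient(0.45, 0.50),
]

THRESHOLD = 0.15  # assumed screening cutoff, chosen only for this example

n_changed = changed_decisions(patients, THRESHOLD)
print(f"Estimates that shifted: {len(patients)} of {len(patients)}")
print(f"Recommendations that changed: {n_changed} of {len(patients)}")
```

In this invented cohort every risk estimate moves, yet only one screening recommendation changes, because most patients sit well above or well below the cutoff under either estimate.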
Q: What do you hope will happen with this research?
Saghafian: What we are doing is providing additional information to policymakers and health policy decision-makers who are trying to change the screening policies and the other decision-making policies that affect these diseases. For each of the three diseases we highlight, there are guidelines. For example, the CDC has guidelines about who we should screen and who we shouldn't screen. So what do you do after you screen somebody and you think they are at risk? We don’t have conclusive evidence that you should or should not include race, but rather data suggesting there are many angles to consider.
Goel: We’re showing that even when race adds predictive value, its clinical utility may be very small. We hope the medical community explicitly considers the clinical utility of race when deciding whether to include it in a predictive tool. In the end, we believe that will improve patient care.
Coots: For other researchers who are publishing work that introduces a new risk model or advocates for one risk model over another, I am curious to see to what extent this framework gets adopted and applied. I will also be keeping an eye on policy.
Sharad and I recently published a follow-up to this paper. In it we provided an overview of recent debates on the fairness of commonly used algorithms in healthcare, identifying common themes and concerns across these debates. We also offered a different perspective on how to weigh algorithm design decisions in the pursuit of more equitable health outcomes.
While we noted that making these policy decisions is no easy task, we hope the discussion helps researchers, clinicians, and policymakers better understand the common threads underlying the debates over using race in estimating disease risk, and design more equitable health care algorithms.