Researchers have found that an overhaul of the performance review system for judging young doctors has led to a large reduction in ratings bias against Black, Latino, and Asian medical residents, although bias persists in the evaluations of U.S.-born Black residents.

A study published in December in the Annals of Internal Medicine compared bias trends in the performance evaluations of hospital residents before and after 2014, when a new evaluation system was introduced that reduced reviewer discretion in the process.

The researchers included senior author Marcella Alsan, Angelopoulos Professor of Public Policy at Harvard Kennedy School, who is a physician, healthcare economist, and public health expert. The lead author was Bradley Gray, senior health services researcher at the American Board of Internal Medicine. Co-authors were Rebecca Lipner and Jonathan Vandergrift, also of the ABIM; Robert Roswell of the Zucker School of Medicine at Hofstra Northwell; and Alicia Fernandez of the University of California, San Francisco.

In the U.S. medical education system, doctors who graduate from medical school usually go on to serve as residents in hospitals for two or more years, treating patients and undergoing further training, often in a specialty field. Doctors go through detailed evaluations during their residencies, and Black, Latino, and Asian residents have reported experiencing evaluation bias that can limit their career prospects.

The adoption of the Milestone evaluation system in 2014 addressed shortcomings including lack of clear descriptions of performance expectations and overreliance on program directors to evaluate residents. The new system gave more weight to clinical competency, committee ratings, and other more structured measurements, along with guidance on increasing the number and diversity of those doing the subjective evaluations.

The researchers looked at ratings data for nearly 60,000 residents who completed residencies in internal medicine before and after the new rating system was adopted, and they compared the more subjective performance ratings with more objective standardized test results on certification exams and other physician scores.

“I am grateful to the American Board of Internal Medicine for partnering with us on this project,” Alsan said. “It is difficult to obtain objective and subjective testing data alongside demographic characteristics of physician trainees within U.S. medicine. But once such data were made available, we could observe the bias that trainees from all backgrounds have been reporting anecdotally borne out in the data—and that systemic changes to reduce bias indeed work.”

The study found large decreases in ratings bias against minority group residents in the years after adoption of the new ratings system. These groups included U.S.-born and foreign-born Asian and Latino residents and non-U.S.-born Black residents. However, the bias decrease was smaller for U.S.-born Black residents, leaving a substantial and concerning gap.

“Importantly, still more needs to be done to improve the situation for U.S.-born Black physicians,” Alsan added. “And there are too few Black doctors in the United States to begin with.”

The authors note that this research matters in part because these minority groups “report experiencing bias in medical education and are underrepresented in academic leadership positions in medicine. Understanding sources of bias is especially important for Black and Latino physicians because they are vastly underrepresented in medicine.”

Roswell said in a video interview on the Annals of Internal Medicine journal website: “As a U.S.-born Black medical student and internal medicine resident, I witnessed first-hand what I thought was evaluator rater bias, and it was really jarring to me as a medical student. I saw it all around me. The question when you are perceiving potential bias is, ‘Is this real, is this in my head?’ And this is something that has really been a goal through my entire career: to find equity in assessment.”

Gray, the lead author, said, “The significant bias we found pre-Milestones both suggests that there exists large underlying evaluation bias against the minoritized groups we studied and provides an explanation as to why these groups are vastly underrepresented in academic medicine and other leadership positions. The fact that large bias persists for U.S.-born Black residents relative to their non-U.S.-born counterparts is troubling and suggests that there is a cultural history deeply seated in racism in the United States that specifically affects bias against Black Americans.”

The researchers found that the new Milestone system, which uses a more structured evaluation format, may mitigate individual bias that affected earlier reviews. But the authors said more research is needed to understand why ratings bias varies among groups based on birthplace. They noted that non-U.S.-born Latino residents and Asian residents reported much greater bias before the new system was adopted than their U.S.-born counterparts. The new system reduced bias against all foreign-born minorities, the report notes.

“Notably, we are still living with the ramifications of substantial bias because most practicing physicians trained during the pre-Milestone period, when bias estimates against all of the minoritized groups that we studied were large,” the report says. That in turn may help explain the cascading effect slowing advancement for minority physicians into leadership positions.

Banner photo by Charles Rex Arbogast/AP
