M-RCBG Associate Working Paper No. 130
Model Behavior: Mitigating Bias in Public Sector Machine Learning Applications
Sara Kassir
2019
Abstract
As the burden of collecting enormous amounts of data has decreased in recent years, advanced methods of analyzing this information have developed rapidly. Machine learning (ML), or the automation of model building, is one such method that has quickly become ubiquitous and impactful across industries. In the public sector, artificially intelligent algorithms are now being deployed to solve problems previously viewed as insurmountable. In international development, they are working to predict areas susceptible to famine; in regulation, they are detecting the sources of foodborne illness; in medicine, they are adding greater speed and precision to diagnostic processes.
The advancements presented by Big Data and ML are undeniably promising, but the technology also poses significant risks, particularly when algorithms are assumed to be infallible. While it may be true that these applications process any information they are given “objectively,” human-generated data invariably reflects human biases. Automated tools can therefore end up entrenching problematic simplifications about the world, under the unfortunate guise of neutrality. Concerns about “racist robots” and “sexist machines” have created mounting pressure for government intervention in artificial intelligence, but the form such action would take remains unclear. Industry players in the private sector have likewise faced calls to address the issue proactively.
This paper seeks to make the challenges surrounding machine learning actionable by contextualizing the technology alongside other modern innovations that have generated ambiguous risks. Like the harms of nuclear radiation or cyberattacks, the consequences of algorithmic bias are imprecise and context-dependent, often taking organizations by surprise. As history has shown, environments driven by such uncertainty tend to be poorly served by “top-down” regulatory frameworks. Instead, ambiguous risks are better handled through internal organizational structures that promote accountability and robust due diligence.
In the case of machine learning applications, this paper argues that accountability and robust due diligence are best achieved through a process known as algorithmic auditing. By assessing the ways in which bias might emerge at each step in the technical development pipeline, it is possible to develop strategies for evaluating every aspect of a model for undue sources of influence. Further, because algorithmic audits encourage systematic engagement with the issue of bias throughout the model-building process, they can facilitate an organization’s broader shift toward socially responsible data collection and use.
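To make the idea of an audit strategy concrete, the sketch below illustrates one simple check of the kind an algorithmic audit might include: a “four-fifths rule” disparate impact test comparing a model’s favorable-outcome rates across demographic groups. The data, group names, and threshold are hypothetical and for illustration only; the paper’s proposed audits span the full development pipeline, not a single metric.

```python
# A minimal sketch of one audit check: the "four-fifths rule" disparate
# impact test. All names and data below are hypothetical.

from collections import defaultdict

def selection_rates(records):
    """Compute the rate of favorable outcomes for each group."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        favorable[group] += outcome  # outcome: 1 = favorable, 0 = unfavorable
    return {g: favorable[g] / totals[g] for g in totals}

def disparate_impact_ratio(records):
    """Ratio of the lowest group selection rate to the highest.
    Values below 0.8 are a common flag for possible adverse impact."""
    rates = selection_rates(records)
    return min(rates.values()) / max(rates.values()), rates

# Hypothetical model outputs: (demographic group, predicted outcome)
predictions = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 0), ("group_b", 1), ("group_b", 0), ("group_b", 0),
]

ratio, rates = disparate_impact_ratio(predictions)
print(f"Selection rates: {rates}")
print(f"Disparate impact ratio: {ratio:.2f}"
      + (" (below 0.8 threshold; flag for review)" if ratio < 0.8 else ""))
```

A check like this addresses only one stage of the pipeline (model outputs); analogous tests would be applied to data collection, feature construction, and deployment.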