Making Machine Learning Models More Trustworthy

I worked with the UC Berkeley Global Policy Lab to make machine learning models friendlier to policymakers and officials.

The CIDER project is a software package that predicts poverty from cellphone data, offering governments a robust way to allocate financial aid without conducting expensive censuses. It is based on the paper Machine learning and phone data can improve targeting of humanitarian aid (Aiken et al., 2022). Emily and the team at Berkeley, along with the team at GiveDirectly, are all fantastic people, and their work on this project has been incredible.

I had the privilege of working with them to add fairness and explainability features to the codebase, making the results easier to interpret for policymakers and other non-technical users.


We can treat the model as a classifier: it takes a person's cellphone information as input and outputs whether or not they might be eligible for aid. Like any classifier, it can produce false positives (mistakenly saying someone is eligible) or false negatives (mistakenly saying someone is not eligible).
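To make the two error types concrete, here is a toy sketch (with made-up labels, not CIDER data) counting false positives and false negatives for an eligibility classifier:

```python
# Hypothetical ground truth and predictions: 1 = eligible, 0 = not eligible.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]

# False positive: model says eligible, person is not.
false_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
# False negative: model says not eligible, person actually is.
false_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

print(false_positives)  # people mistakenly flagged as eligible -> 2
print(false_negatives)  # people mistakenly denied aid -> 1
```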

The idea of fairness is that the model should not perform disproportionately worse on different groups of people. For instance, if the model produces a ton of false positives for men and a ton of false negatives for women, then simply being a woman gives someone a lower chance of receiving aid. The model's predictions for different groups should be similarly close to the ground truth.
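One simple way to check this is to compare error rates across groups. The sketch below (hypothetical records, not the CIDER API) computes the false-negative rate per group, i.e. the share of truly eligible people in each group whom the model denies:

```python
from collections import defaultdict

# Hypothetical records: (group, ground truth, prediction); 1 = eligible.
records = [
    ("men",   1, 1), ("men",   0, 0), ("men",   1, 1), ("men",   1, 0),
    ("women", 1, 0), ("women", 1, 0), ("women", 0, 0), ("women", 1, 1),
]

fn = defaultdict(int)   # false negatives per group
pos = defaultdict(int)  # truly eligible people per group
for group, truth, pred in records:
    if truth == 1:
        pos[group] += 1
        if pred == 0:
            fn[group] += 1

# A fair model would have roughly equal rates across groups.
fn_rates = {g: fn[g] / pos[g] for g in pos}
print(fn_rates)  # here women are denied twice as often: {'men': 0.33..., 'women': 0.66...}
```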

The fairness module visualizes the model's performance on different groups in a variety of ways, allowing policymakers to verify that the model is fair.


As models get more and more complex, they become less and less interpretable. Random forests and neural networks are almost black boxes, and any individual prediction is almost impossible to explain.

However, the behavior of a complex model can be approximated locally by a low-complexity model, such as a linear regression, whose weights are much easier to interpret.

Therefore, to explain a complex model's prediction at any given point, approximate the model's behavior with a linear model near that point, then plug the point into that linear model. This gives you a readable list of features and how much each one contributes to the prediction.
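The idea above can be sketched generically: sample perturbed points around the input, query the black-box model, and fit a least-squares line to those samples. This is a minimal illustration of local surrogate explanation (in the spirit of LIME), not the codebase's actual implementation; the function names and the toy model are assumptions for the example.

```python
import numpy as np

def explain_locally(predict, x, n_samples=500, scale=0.1, seed=0):
    """Fit a linear surrogate to `predict` in a small neighborhood of x.

    Returns one weight per feature, approximating how much each feature
    contributes to the prediction near x.
    """
    rng = np.random.default_rng(seed)
    # Sample perturbed points around x and query the black-box model.
    X = x + scale * rng.standard_normal((n_samples, x.size))
    y = np.array([predict(p) for p in X])
    # Least-squares linear fit (with an intercept column) on the samples.
    A = np.hstack([X, np.ones((n_samples, 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1]  # per-feature weights; the last entry is the intercept

# Toy black-box model (an assumption for illustration): nonlinear in the
# first feature, so a global linear fit would be misleading, but near a
# point its local slopes are easy to read off.
black_box = lambda p: p[0] ** 2 + 3 * p[1]
x = np.array([1.0, 2.0])
weights = explain_locally(black_box, x)
print(weights)  # close to [2.0, 3.0]: the local slopes of the model at x
```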

Here is a slideshow that I presented at a lab meeting at the end of the semester, with a few examples of fairness and explainability plots.

<iframe src="" frameborder="0" width="1440" height="839" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>