As regulators repeatedly warn banks and fintechs that their artificial intelligence models must be transparent, explainable, fair and free of bias, especially when making loan decisions, some of those institutions are taking extra steps to prove their algorithms meet those requirements.
A case in point is Upgrade, a San Francisco-based challenger bank that provides mobile banking, personal loans, a hybrid debit and credit card, a credit builder card, auto loans and home improvement loans to five million consumers. Upgrade is partnering with an “embedded fairness” provider called FairPlay to back-test and monitor its models in real time to make sure the decisions supported by the models are free of bias. FairPlay already works with 25 banks and fintechs, including Varo Bank, Figure and Octane Lending.
“What [the partnership with FairPlay] is accomplishing for us is making sure we are fair and compliant and making appropriate credit decisions that don’t have a disparate impact on any protected category,” said Renaud Laplanche, founder and CEO of Upgrade, in an interview. Over time, Upgrade plans to apply FairPlay to all its credit products.
Banks, fintechs and the banking-as-a-service ecosystem have been under a lot of regulatory scrutiny lately. High on the list of supervisory and enforcement issues has been fair lending, because regulators are concerned that banks and fintechs are using alternative credit data and advanced AI models in ways that can be hard to understand and explain, and where bias can creep in. In some recent consent orders, regulators have demanded that banks monitor their lending models for fairness.
These concerns are not new. Financial firms have been using AI in their lending models for years, and regulators have made clear from the start that they have to comply with all applicable laws, including the Equal Credit Opportunity Act and the Fair Housing Act, which prohibit discrimination based on characteristics such as race.
But proving that AI-based lending models are not discriminatory is a newer frontier.
“There’s an emerging consensus that if you want to use AI and big data, that you have to take the biases that are inherent in these systems really seriously,” said Kareem Saleh, founder and CEO of FairPlay, in an interview. “You have to inquire into those biases rigorously, and you’ve got to commit yourself with seriousness and purpose to fixing issues if you find them.”
Upgrade is showing a lot of leadership, both for itself and the industry, Saleh said, in stepping up its compliance technology in this area.
Upgrade makes loan decisions using a machine learning technique called gradient boosting. (Behind the scenes, the company’s personal loans and auto refinance loans are made by partners Cross River Bank and Blue Ridge Bank. Home improvement loans and personal credit lines also are made by Cross River Bank, which issues the Upgrade Card.) About 250 banks buy Upgrade’s loans.
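For readers unfamiliar with the technique, here is a minimal, hypothetical sketch of what a gradient-boosted credit model looks like in code, using scikit-learn and synthetic data. The features, labels and thresholds are purely illustrative and are not Upgrade's actual model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical applicant features: credit score, annual income,
# months employed, debt-to-income ratio.
X = np.column_stack([
    rng.normal(680, 60, n),       # credit_score
    rng.lognormal(10.8, 0.5, n),  # annual_income
    rng.integers(0, 240, n),      # months_employed
    rng.uniform(0.0, 0.6, n),     # debt_to_income
])

# Synthetic "repaid" label for demonstration purposes only.
y = (X[:, 0] + 0.001 * X[:, 1] - 300 * X[:, 3] + rng.normal(0, 50, n) > 650).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient boosting builds an ensemble of shallow decision trees,
# each tree correcting the errors of the trees before it.
model = GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.05)
model.fit(X_train, y_train)
print(f"Holdout accuracy: {model.score(X_test, y_test):.3f}")
```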
Banks that buy loans from Upgrade and other fintechs look for evidence of compliance with the Equal Credit Opportunity Act and other laws that regulate lending. On top of that, Upgrade has its own compliance requirements, as do its bank partners and the banks that buy its loans. FairPlay's APIs will keep an eye on all of these requirements, back-testing and monitoring Upgrade's models for signs of anything that could adversely affect any group.
One aspect of the software that Laplanche was drawn to was its ability to monitor in real time.
“That’s where it gets more effective and simpler to use, as opposed to doing a periodic audit and shipping data to third parties and then getting the results back a few weeks or months later,” Laplanche said. “Here you have this continuous service that’s always running, that can pick up signals very quickly, that can help us make adjustments very quickly. We like the fact that it’s embedded and it’s not a batch process.”
FairPlay’s software is most commonly used to back-test lending models. It will run a model against loan applications from two years ago and see how that model would’ve performed if it had been in production back then.
“Then it’s possible to make some reasonable estimates about what the outcomes of that model would be on different groups,” Saleh said.
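In code, a back-test of this kind can be sketched as scoring historical applications with the model and comparing approval rates across groups. The sketch below assumes a fitted classifier (`model`), an array of past applications (`historical_X`) and group labels; it is a generic illustration, not FairPlay's implementation.

```python
import numpy as np

def backtest_approval_rates(model, historical_X, group_labels, threshold=0.5):
    """Score past applications and compare approval rates across groups."""
    # Treat class 1 as "predicted to repay"; approve above the score threshold.
    approve = model.predict_proba(historical_X)[:, 1] >= threshold

    rates = {}
    for group in np.unique(group_labels):
        mask = group_labels == group
        rates[group] = approve[mask].mean()

    # Adverse impact ratio: each group's approval rate relative to the
    # most-approved group. Values well below 1.0 warrant a closer look.
    reference = max(rates.values())
    air = {group: rate / reference for group, rate in rates.items()}
    return rates, air

# Usage, with a fitted model and inferred or self-reported group labels:
# rates, air = backtest_approval_rates(model, historical_X, group_labels)
```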
If the back testing turns up a problem, like disproportionate lending to white men over women and minorities, then the software can be used to determine which variables are driving disparate outcomes for the different groups.
Once those are identified, the question is, “Do I need to rely on those variables as much as I do?” Saleh said. “Are there some other variables that might be similarly predictive but have less of a disparity driving effect? All of those questions can only be asked if you take that first step of testing the model and saying, what are the outcomes for all of these groups?”
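One generic way to approximate this kind of analysis is to neutralize each variable in turn and measure how much the approval-rate gap shrinks. The sketch below shuffles each feature as a crude form of neutralization; the function and parameter names are hypothetical, and FairPlay's proprietary methods may differ.

```python
import numpy as np

def disparity_drivers(model, X, group_labels, protected_group, reference_group,
                      feature_names, threshold=0.5, seed=0):
    """Estimate each feature's contribution to the approval-rate gap by
    shuffling the feature and re-measuring the gap."""
    rng = np.random.default_rng(seed)

    def approval_gap(features):
        approve = model.predict_proba(features)[:, 1] >= threshold
        return (approve[group_labels == reference_group].mean()
                - approve[group_labels == protected_group].mean())

    base_gap = approval_gap(X)
    contributions = {}
    for j, name in enumerate(feature_names):
        X_shuffled = X.copy()
        # Break the feature's relationship to the prediction.
        X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
        # How much the gap closes when this feature is neutralized.
        contributions[name] = base_gap - approval_gap(X_shuffled)
    return base_gap, contributions
```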
Women who left the workforce for several years to raise children, for instance, have inconsistent income, which looks like a big red flag to a loan underwriting model. But information about the credit performance of women can be used to adjust the weights on the variables in ways that make the model more sensitive to women as a class, Saleh said.
A Black person who grew up in a community where there were no bank branches, and therefore mostly used check cashers, is unlikely to have a high FICO score and may not have a bank account. In a case like this, Saleh said, a model might be adjusted to reduce the influence of credit score and tune up the influence of consistent employment.
Such adjustments can “allow the model to capture these populations that it was previously insensitive to because of over-reliance on certain pieces of information,” Saleh said.
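A common mitigation in the fairness literature, offered here only to illustrate the idea Saleh describes rather than FairPlay's actual technique, is to retrain the model with sample weights that give more influence to good repayment outcomes from the underrepresented group. The `reweighed_fit` helper and its `boost` parameter below are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def reweighed_fit(X, y, group_labels, protected_group, boost=2.0):
    """Retrain with extra weight on good repayment outcomes from the
    protected group, so the model learns their patterns rather than
    leaning on a single proxy such as credit score."""
    weights = np.ones(len(y))
    weights[(group_labels == protected_group) & (y == 1)] *= boost
    model = GradientBoostingClassifier(n_estimators=200, max_depth=3,
                                       learning_rate=0.05)
    model.fit(X, y, sample_weight=weights)
    return model
```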
FairPlay’s back testing can be done on underwriting models of all kinds, from linear and logistic regression to advanced machine learning models, Saleh said.
“The AI models are where all of the action is these days,” Saleh said. “More advanced AI models are harder to explain. So it’s harder to understand what variables drove their decisions, and they can consume a lot more information, and they can consume information that’s messy, missing or wrong. That makes the fairness analysis much more subtle than a world where you’re dealing with a relatively explainable model and data that’s largely present and correct.”
As it monitors the outcomes of models, FairPlay can be used to detect unfair behavior and suggest changes or corrections.
“If the fairness starts to degrade, we try to understand why,” Saleh said. “How do we make sure that the underwriting stays fair, in a dynamically changing economic environment? Those are questions that have never really been asked or grappled with before.”
FairPlay began offering real-time monitoring relatively recently. Because technology and economic conditions have been changing quickly, “episodic testing is no longer sufficient,” Saleh said.
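Conceptually, continuous monitoring of this sort can be sketched as a rolling window over live decisions that recomputes approval rates by group and raises an alert when the ratio between groups degrades. The window size and 0.8 alert threshold below are illustrative choices, not FairPlay's actual configuration.

```python
from collections import deque

class FairnessMonitor:
    """Track recent decisions and flag groups whose approval rate falls
    well below the most-approved group's rate."""

    def __init__(self, window=1000, air_alert=0.8):
        self.decisions = deque(maxlen=window)  # (group, approved) pairs
        self.air_alert = air_alert

    def record(self, group, approved):
        self.decisions.append((group, bool(approved)))
        return self.check()

    def check(self):
        totals, approvals = {}, {}
        for group, approved in self.decisions:
            totals[group] = totals.get(group, 0) + 1
            approvals[group] = approvals.get(group, 0) + int(approved)
        rates = {g: approvals[g] / totals[g] for g in totals}
        if not rates:
            return {"rates": {}, "alerts": []}
        reference = max(rates.values())
        alerts = [g for g, r in rates.items()
                  if reference > 0 and r / reference < self.air_alert]
        return {"rates": rates, "alerts": alerts}

# Example: feed each live decision into the monitor as it is made.
# monitor = FairnessMonitor()
# result = monitor.record(group="A", approved=True)
```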
Technology like FairPlay’s is important, said Patrick Hall, a professor at George Washington University who has been involved with the NIST AI risk management framework. He considers FairPlay’s software a credible tool.
“People are certainly going to need good tools,” Hall said. “But they have to go along with processes and culture to really have any effect.”
Good modeling culture and processes include making sure the programmer teams have some diversity.
“More diverse teams have fewer blind spots,” Hall said. This doesn’t just mean demographic diversity, but having people with a wide array of skills – including economists, statisticians and psychometricians.
Good processes include transparency, accountability and documentation.
“It’s just old fashioned governance,” Hall said. “If you train this model, you have to write a document on it. You have to sign that document, and you may actually experience consequences if the system doesn’t work as intended.”
Originally posted on The American Banker.