Imagine needing a loan but fearing your social media habits could sabotage your chances.
Privacy-fairness tradeoffs like this one increasingly confront our clients as they use alternative data in high-stakes applications such as underwriting.
Lenders using AI systems rely on extensive data, including personal, financial, and behavioral information, for underwriting and other decisions. This raises privacy concerns, as sensitive details about financial status, spending habits, and social behaviors are collected. The volume and granularity of this data heighten the risk of unauthorized access and misuse.
Additionally, lenders often share this data with third parties for processing, analytics, and decision support, increasing privacy risks by expanding the number of entities with access to the data.
As a result, policymakers and consumer advocates are calling for the use of privacy-enhancing technologies (PETs) to manage sensitive data. PETs range from stripping datasets of identifiable information to using synthetic data that mimics real datasets for AI training. These techniques aim to obscure or delete demographic identifiers, reducing the risk of data misuse.
But what if these privacy-preserving techniques make it difficult to assess the fairness of algorithmic decision-making—or worse, introduce disparities themselves?
A recent study found that while PETs can protect individual privacy, they introduce fairness risks by making it harder to identify and rectify biases in AI systems.
This is where privacy-fairness tradeoffs come in.
Generally, the less a model is allowed to know about the individuals in its training data, the less analytical insight it can produce.
At the same time, sensitive attributes such as race, gender, and age may be illegal to use as model inputs yet essential for fairness assessments, like comparing approval rates between men and women or checking whether Hispanic applicants face higher interest rates.
This puts lenders in a challenging position: balancing individual privacy rights with ensuring their algorithms do not discriminate.
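To make the fairness side of that tension concrete, here is a minimal sketch of the kind of disparity check that requires protected attributes in the first place. The column names and toy records are entirely hypothetical, and a real assessment would of course use actual underwriting data and proper statistical tests.

```python
import pandas as pd

# Hypothetical lending outcomes; illustrative only.
df = pd.DataFrame({
    "gender":    ["M", "F", "F", "M", "F", "M", "F", "M"],
    "ethnicity": ["Hispanic", "Non-Hispanic", "Hispanic", "Non-Hispanic",
                  "Hispanic", "Non-Hispanic", "Non-Hispanic", "Hispanic"],
    "approved":  [1, 0, 1, 1, 0, 1, 1, 0],
    "interest_rate": [0.09, None, 0.11, 0.07, None, 0.08, 0.075, None],
})

# Demographic parity check: compare approval rates across genders.
approval_by_gender = df.groupby("gender")["approved"].mean()
print(approval_by_gender)
print("Approval-rate gap (M - F):",
      approval_by_gender["M"] - approval_by_gender["F"])

# Pricing disparity check: compare average interest rates across
# ethnic groups among approved applicants.
rates = df[df["approved"] == 1].groupby("ethnicity")["interest_rate"].mean()
print(rates)
```

Strip out the gender and ethnicity columns and there is simply nothing left to compare, which is the crux of the tradeoff.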
Fortunately, there are ways to address privacy-fairness tradeoffs.
First, regulators understand this challenge. Laws in the U.S. and Europe offer protections and exemptions for self-tests to assess algorithmic bias, acknowledging the need for sensitive data like race in these analyses.
Second, lenders can use data minimization and protection strategies such as anonymization, encryption, and synthetic data generation to protect privacy while still enabling fairness assessments.
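As a rough illustration, and assuming a hypothetical applicant table, a minimization pass might pseudonymize direct identifiers and coarsen quasi-identifiers before the data ever reaches a fairness analysis:

```python
import hashlib
import pandas as pd

# Hypothetical applicant records; field names are illustrative only.
applicants = pd.DataFrame({
    "ssn":        ["123-45-6789", "987-65-4321"],
    "name":       ["A. Applicant", "B. Borrower"],
    "zip_code":   ["60614", "60615"],
    "birth_year": [1984, 1991],
    "income":     [72000, 55000],
})

def pseudonymize(value: str, salt: str = "rotate-this-salt") -> str:
    """Replace a direct identifier with a salted, one-way hash.
    In practice the salt would be stored and rotated securely."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

minimized = applicants.copy()
minimized["applicant_id"] = minimized["ssn"].map(pseudonymize)
minimized = minimized.drop(columns=["ssn", "name"])    # drop direct identifiers
minimized["zip3"] = minimized["zip_code"].str[:3]      # coarsen quasi-identifiers
minimized["age_band"] = (2024 - minimized["birth_year"]) // 10 * 10
minimized = minimized.drop(columns=["zip_code", "birth_year"])

print(minimized)
```

The point of the sketch is that the fields an analyst actually needs for disparity testing can survive, while the fields that identify a specific person do not.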
Third, technologists are developing fairness-aware PETs designed to protect privacy without obscuring the signals needed for bias testing. For example, federated learning lets firms collaboratively train models without ever sharing the underlying records.
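For readers who have not seen federated learning in action, here is a minimal federated-averaging sketch using synthetic data and a toy logistic regression. The setup is hypothetical, but it conveys the idea: each lender trains locally, and only model weights, never applicant records, travel to a coordinator.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=20):
    """One lender trains locally via logistic-regression gradient descent;
    only the updated weights leave the premises, never the raw data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (preds - y) / len(y)
        w -= lr * grad
    return w

# Hypothetical: three lenders, each holding its own private features and labels.
clients = [(rng.normal(size=(50, 4)), rng.integers(0, 2, 50).astype(float))
           for _ in range(3)]

global_w = np.zeros(4)
for _ in range(10):
    # Each client starts from the shared global model and trains on local data.
    local_weights = [local_update(global_w, X, y) for X, y in clients]
    # The coordinator averages the weight vectors; raw records are never pooled.
    global_w = np.mean(local_weights, axis=0)

print("Federated model weights after 10 rounds:", np.round(global_w, 3))
```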
I’d be interested to hear what privacy practitioners out there think, but it seems to me that navigating privacy-fairness tradeoffs will only get tougher as AI and alternative data become more pervasive in high-stakes domains.