Quantifying and mitigating AI bias

updated: 2023-01-09

#ai-bias

Understanding and quantifying the bias manifold

Understanding the bias manifold in embeddings is instrumental in designing debiasing algorithms. This seminal work demonstrates that word embeddings have a linear axis accounting for gender bias, i.e., a bias dimension.
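
A common way to recover such an axis is to take the top principal component of difference vectors between definitional gendered pairs (in the spirit of Bolukbasi et al., 2016, simplified here). A minimal sketch, assuming a hypothetical dict `emb` mapping words to numpy vectors:

```python
import numpy as np

def bias_direction(emb, pairs):
    """Leading singular vector of stacked gendered difference vectors.

    A simplification of the pair-centered PCA in Bolukbasi et al.;
    `emb` is an assumed word -> vector mapping, not a specific library.
    """
    diffs = np.array([emb[a] - emb[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[0]  # unit-norm bias axis

pairs = [("he", "she"), ("man", "woman"), ("king", "queen"),
         ("father", "mother"), ("male", "female")]
# g = bias_direction(emb, pairs)
# projection of any word onto the bias axis:
# score = emb["doctor"] @ g
```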

Some word embedding models, including GloVe and SGNS word2vec, embed bias into a linear subspace because word associations are captured by frequency ratios.

The linearity of the bias manifold has led to metrics that quantify the degree of bias in embeddings.

WEAT (Word Embedding Association Test) statistics are common metrics, although they tend to overestimate the bias and cannot capture non-linear bias:
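
A minimal sketch of the WEAT effect size (Caliskan et al., 2017), again assuming the hypothetical `emb` mapping; the word sets X, Y (targets) and A, B (attributes) are chosen by the analyst:

```python
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B, emb):
    """Differential association of word w with attribute sets A and B."""
    return (np.mean([cos(emb[w], emb[a]) for a in A])
            - np.mean([cos(emb[w], emb[b]) for b in B]))

def weat_effect_size(X, Y, A, B, emb):
    """Normalized difference of mean associations between target sets."""
    sx = [assoc(x, A, B, emb) for x in X]
    sy = [assoc(y, A, B, emb) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```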

The picture may be more complex than initially thought, as the bias manifold can be affected by seemingly inconsequential factors such as word popularity.

One can quantify the local bias manifold spanned by a word's k-nearest neighbors:
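
One simplified way to operationalize this, a sketch rather than a method from the literature: average the projection of a word's k nearest neighbors onto the bias axis `g` from the earlier sketch, under the same `emb` assumption:

```python
import numpy as np

def local_bias(word, emb, g, k=10):
    """Mean projection onto bias axis g over the k nearest neighbors."""
    words = list(emb)
    M = np.array([emb[w] for w in words])
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    q = emb[word] / np.linalg.norm(emb[word])
    sims = M @ q
    nn = np.argsort(-sims)[1:k + 1]  # skip the word itself
    return float(np.mean(M[nn] @ g))
```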

A word takes a distribution, instead of a point in space, when the embedding is contextual, as a word's representation varies depending on the surrounding words. This calls for a new framework to probe the bias manifold:
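
A sketch of how one might collect that distribution with a contextual model, assuming Hugging Face `transformers`; the model choice, probe sentences, and subword mean-pooling are illustrative assumptions, not a specific framework:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def word_vectors(word, sentences):
    """One vector per occurrence of `word`, mean-pooled over subwords."""
    vecs = []
    sub_ids = tok(word, add_special_tokens=False)["input_ids"]
    for s in sentences:
        ids = tok(s, return_tensors="pt")
        with torch.no_grad():
            h = model(**ids).last_hidden_state[0]  # (seq_len, dim)
        seq = ids["input_ids"][0].tolist()
        for i in range(len(seq) - len(sub_ids) + 1):
            if seq[i:i + len(sub_ids)] == sub_ids:
                vecs.append(h[i:i + len(sub_ids)].mean(0))
    return torch.stack(vecs)  # a point cloud, not a single point

# V = word_vectors("nurse", ["The nurse said hello.", "A nurse was hired."])
# V.std(0) summarizes how much the representation moves with context.
```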

Identifying bias

Perturbation analysis can reveal harmful data samples that induce bias:
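
One crude form of such an analysis, sketched under stated assumptions rather than taken from a specific paper: hold out one document at a time, retrain a small embedding, and track how the bias score moves. Assumes gensim and the `weat_effect_size` sketch above; retraining per document is expensive and only practical for small corpora:

```python
from gensim.models import Word2Vec

def bias_impact(corpus, X, Y, A, B):
    """Change in WEAT effect size attributable to each document.

    `corpus` is a list of tokenized documents (lists of tokens);
    hyperparameters are placeholders.
    """
    def score(docs):
        m = Word2Vec(docs, vector_size=100, min_count=1, seed=0, workers=1)
        return weat_effect_size(X, Y, A, B, m.wv)
    base = score(corpus)
    # positive value: removing document i reduces the measured bias
    return [base - score(corpus[:i] + corpus[i + 1:])
            for i in range(len(corpus))]
```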

Alternatively, one can identify the bias a posteriori from the embedding:
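
For example, one can rank the vocabulary by the magnitude of its projection onto the bias axis `g` from the earlier sketch; words at the top are the most bias-loaded in the embedding. A minimal sketch:

```python
import numpy as np

def most_biased(emb, g, top=20):
    """Words with the largest |projection| onto the bias axis g."""
    scores = {w: float(v @ g) for w, v in emb.items()}
    return sorted(scores.items(),
                  key=lambda kv: abs(kv[1]), reverse=True)[:top]
```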

Debiasing
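
Given a linear bias axis as above, the simplest mitigation is the "neutralize" step of hard debiasing (Bolukbasi et al.): project the bias direction out of words that should be neutral. A sketch; the complementary "equalize" step for definitional pairs is omitted:

```python
import numpy as np

def neutralize(v, g):
    """Remove the component of v along the (unit-norm) bias axis g."""
    v = v - (v @ g) * g           # project out the bias direction
    return v / np.linalg.norm(v)  # re-normalize
```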

Debiasing graph embedding