Lesson 8.2 — AI Bias: Where It Comes From and What You Can Do


Bias in AI systems is not a matter of AI having opinions or bad intentions. It is a structural problem: AI models learn patterns from data created by humans in a world that has always contained inequalities, assumptions, and blind spots. Those patterns get embedded in the model's outputs, often invisibly, where they can affect real people in ways that compound existing disadvantages.

Understanding where bias comes from, what it looks like in practice, and how to mitigate it as a user is not a theoretical exercise. It is a practical skill for anyone using AI in contexts that affect people.

Four Sources of Bias

Source 1: Training Data Bias

AI models learn from large datasets drawn from web scrapes, digitised text, and human-curated sources. Those sources reflect the world as it was when they were written, including who wrote them, who was represented in them, and what assumptions the authors held.

If historical text describes nurses as "she" and engineers as "he" far more often than the reverse, a model trained on that text learns those statistical associations. It does not know this is a bias — it has learned a pattern.
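The counting behind such an association can be sketched in a few lines of Python. This is only an illustration: the corpus below is invented for the example, and `pronoun_counts` is a hypothetical helper, not how any real model is trained. Real models learn far subtler statistics, but the principle of skewed counts producing skewed associations is the same.

```python
from collections import Counter
import re

# Toy corpus invented for illustration; real training data is vastly larger.
CORPUS = [
    "The nurse said she would check the chart.",
    "The engineer said he fixed the bridge design.",
    "The nurse told us she was on the night shift.",
    "The engineer explained that he tested the circuit.",
    "The nurse said he would call the doctor.",
    "The engineer said she reviewed the code.",
    "The engineer noted he was running late.",
]

def pronoun_counts(sentences, occupation):
    """Count 'he' and 'she' in sentences that mention the occupation."""
    counts = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        if occupation in words:
            counts.update(w for w in words if w in ("he", "she"))
    return counts

for job in ("nurse", "engineer"):
    print(job, dict(pronoun_counts(CORPUS, job)))
```

In this toy corpus "nurse" appears with "she" twice as often as with "he", and "engineer" with "he" three times as often as with "she". A system that only sees the counts has no way to tell a pattern from a prejudice.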

Example: Image generation models trained predominantly on Western photography may default to depicting "a professional" or "a CEO" as a white man unless explicitly prompted otherwise. This is not because the model intends discrimination; it is reproducing the demographic distribution in its training data.

Source 2: Label Bias

Many AI systems are trained using human-labelled data: examples that people have looked at and categorised ("this is appropriate / inappropriate," "this face looks trustworthy / untrustworthy," "this resume is strong / weak"). Those human labellers bring their own biases, conscious or not, and those biases get baked into the training signal.

Source 3: Feedback and Reinforcement Bias

Models refined through human feedback (RLHF, reinforcement learning from human feedback, is a technique widely used to make AI more helpful and safe) are shaped by the preferences of the people providing that feedback. If those people skew in any direction, whether demographically, culturally, or philosophically, the model's outputs will skew the same way.

Source 4: Historical Bias in Outcomes

When AI systems predict outcomes based on historical data, they can perpetuate historical injustice. If past lending decisions, hiring decisions, or sentencing decisions were biased, a model trained on those outcomes will learn to replicate those patterns.

Key takeaway: Bias in AI is not primarily a technical problem. It is a data problem, a human problem, and a structural problem. Better algorithms alone cannot fix it.

Types of Bias: A Reference Table

Bias type	Description	Example
Representation bias	Some groups are underrepresented in training data	Medical AI trained mostly on data from white patients performs worse for others
Measurement bias	Proxies used for a quality introduce inequity	Using ZIP code as a proxy for creditworthiness penalises redlined areas
Aggregation bias	One model is used for groups with different characteristics	A single diabetes risk model applied across racial groups with different risk profiles
Evaluation bias	Benchmark datasets don't represent the deployment population	A facial recognition system benchmarked on light-skinned faces underperforms on dark-skinned ones
Historical bias	Data reflects past discrimination	Hiring AI trained on historical hire data perpetuates past biases
Linguistic bias	Models perform better on dominant languages and dialects	Sentiment analysis tools often perform worse on African American Vernacular English
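Several rows in this table (representation, evaluation, and linguistic bias in particular) share a symptom: a headline accuracy figure that looks fine while one group is served noticeably worse. A minimal Python sketch of how a per-group breakdown can surface this follows. The records and group names are invented for illustration, and `accuracy_by_group` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical evaluation records: (group, prediction_was_correct).
# The numbers are invented to illustrate the effect, not measured results.
RESULTS = (
    [("group_a", True)] * 90 + [("group_a", False)] * 10
    + [("group_b", True)] * 14 + [("group_b", False)] * 6
)

def accuracy_by_group(results):
    """Return the fraction of correct predictions for each group."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in results:
        totals[group] += 1
        correct[group] += int(is_correct)
    return {group: correct[group] / totals[group] for group in totals}

overall = sum(is_correct for _, is_correct in RESULTS) / len(RESULTS)
per_group = accuracy_by_group(RESULTS)

print(f"overall: {overall:.2f}")
for group, accuracy in per_group.items():
    print(f"{group}: {accuracy:.2f}")
```

Because group_b contributes only 20 of the 120 records here, its 70% accuracy barely moves the overall figure of about 87%. Reporting only the aggregate would hide the gap.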

Two Documented Real-World Cases

COMPAS Recidivism Algorithm

COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) was widely used in the US criminal justice system to predict an individual's likelihood of reoffending — informing decisions about bail, sentencing, and parole.

In 2016, ProPublica published an analysis showing that COMPAS wrongly labelled Black defendants who did not go on to reoffend as high risk at almost twice the rate of white defendants. Conversely, white defendants who did reoffend were more likely to have been labelled lower risk. The algorithm did not use race as an input variable, but it used proxies correlated with race (neighbourhood, prior contact with law enforcement) that carried the same discriminatory effect.
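The two error types in that comparison are the false positive rate (labelled high risk, did not reoffend) and the false negative rate (labelled lower risk, did reoffend), each computed separately per group. A rough Python sketch of the calculation follows. The counts and group names are invented for illustration and are not the ProPublica figures; `make_records` and `error_rates` are hypothetical helpers.

```python
def make_records(fp, tn, fn, tp):
    """Build (labelled_high_risk, reoffended) pairs from confusion counts."""
    return (
        [(True, False)] * fp + [(False, False)] * tn
        + [(False, True)] * fn + [(True, True)] * tp
    )

def error_rates(records):
    """Return (false positive rate, false negative rate) for one group."""
    did_not_reoffend = [high for high, reoffended in records if not reoffended]
    did_reoffend = [high for high, reoffended in records if reoffended]
    # Share of people who did not reoffend but were labelled high risk.
    fpr = sum(did_not_reoffend) / len(did_not_reoffend)
    # Share of people who did reoffend but were labelled lower risk.
    fnr = sum(not high for high in did_reoffend) / len(did_reoffend)
    return fpr, fnr

# Invented counts for two hypothetical groups; NOT the ProPublica data.
GROUPS = {
    "group_x": make_records(fp=40, tn=60, fn=30, tp=70),
    "group_y": make_records(fp=20, tn=80, fn=50, tp=50),
}

for name, records in GROUPS.items():
    fpr, fnr = error_rates(records)
    print(f"{name}: false positive rate {fpr:.2f}, false negative rate {fnr:.2f}")
```

In this made-up data the tool errs against group_x (more false alarms) and in favour of group_y (more missed cases), which is the shape of disparity the analysis described. Overall accuracy alone would not reveal it.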

The COMPAS case illustrates a critical point: removing a protected characteristic from a model does not remove bias if proxies for that characteristic remain.
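The proxy effect can be shown with a deliberately simple Python sketch. Everything here is an assumption made for illustration: the applicants, the two zones, and the `approve` rule are invented and do not describe COMPAS or any real system. The rule never sees group membership, yet its outcomes differ sharply by group because zone and group are correlated.

```python
# Invented applicants: (group, zone). The decision rule never sees the group.
APPLICANTS = (
    [("A", "north")] * 80 + [("A", "south")] * 20
    + [("B", "north")] * 25 + [("B", "south")] * 75
)

def approve(zone):
    """A 'group-blind' rule that looks only at the zone (the proxy)."""
    return zone == "north"

def approval_rate(applicants, group):
    """Fraction of the given group's applicants that the rule approves."""
    zones = [zone for g, zone in applicants if g == group]
    return sum(approve(zone) for zone in zones) / len(zones)

for group in ("A", "B"):
    print(f"group {group}: approval rate {approval_rate(APPLICANTS, group):.2f}")
```

Group A is approved 80% of the time and group B 25% of the time, even though the protected characteristic was removed from the inputs.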

Amazon Hiring Tool

Amazon developed an AI recruiting tool to review CVs and score candidates for technical roles. The system was trained on historical CVs submitted to Amazon over a 10-year period — predominantly from men, reflecting the demographics of the tech industry.

The model learned to penalise CVs that included the word "women's" (as in "women's chess club captain") and downgraded graduates of two all-women's colleges. Reuters reported in 2018 that Amazon had scrapped the tool after discovering these issues.

This case shows how historical data, reflecting existing industry demographics, can actively reproduce and reinforce those demographics in a system designed to be objective.

What Users Can Do: Practical Mitigation Strategies

Strategy 1: Audit your prompts for assumptions

Your prompts encode your assumptions. "Write a story about a successful entrepreneur" will often produce a default demographic profile unless you specify otherwise. Ask yourself: am I specifying something neutral, or am I assuming a default?

Better prompts:

"Write a story about a successful entrepreneur — specifically a Latina woman in her 50s"
"Describe a nurse — vary the age, background, and gender across three different descriptions"

Strategy 2: Request diverse perspectives explicitly

When generating content about people, topics that affect different groups, or analyses of human behaviour, explicitly ask for diverse representation:

"In your response, ensure the examples represent a diverse range of ages, genders, ethnicities, and backgrounds."

Strategy 3: Test for differential performance

If you are using AI in a professional context that affects people (hiring, lending, healthcare, education), test whether it performs equally across different groups. Run the same scenario with different demographic details and compare outputs.
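One way to run such a comparison systematically is to generate prompts that are identical except for the demographic details and collect the responses side by side. The Python sketch below assumes a hypothetical loan-assessment scenario. The template, names, and ages are invented, and `fake_ask` is a stand-in for whatever AI tool is being tested, not a real API.

```python
from itertools import product

# Hypothetical scenario; only the name and age change between prompts.
TEMPLATE = (
    "Assess this loan application. Applicant: {name}, aged {age}. "
    "Income: 42,000 per year. Requested amount: 10,000."
)
NAMES = ["Emily Clark", "Lakisha Washington", "Mohammed Rahman"]  # invented
AGES = [28, 61]

def build_variants(template, names, ages):
    """Return prompts that differ only in the demographic details."""
    return [
        {"name": name, "age": age, "prompt": template.format(name=name, age=age)}
        for name, age in product(names, ages)
    ]

def collect_outputs(variants, ask):
    """Send every variant to `ask` and key each response by its details."""
    return {(v["name"], v["age"]): ask(v["prompt"]) for v in variants}

def fake_ask(prompt):
    # Stand-in so the sketch runs offline; swap in the real tool's client.
    return "PLACEHOLDER RESPONSE"

variants = build_variants(TEMPLATE, NAMES, AGES)
outputs = collect_outputs(variants, fake_ask)
for details, response in outputs.items():
    print(details, "->", response)
```

With a real tool in place of `fake_ask`, the responses can be reviewed for differences in tone, recommendation, or stated reasons. Because AI outputs vary from run to run, repeating each prompt several times gives a fairer comparison than a single response.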

Strategy 4: Treat AI outputs on social topics with extra scepticism

For topics involving race, gender, religion, disability, or socioeconomic status, apply more critical scrutiny than you would to neutral topics. Ask: whose perspective does this reflect? What is it not saying?

Strategy 5: Maintain human review for high-stakes decisions

Never delegate high-stakes decisions that affect people's lives — hiring, lending, health assessments, performance evaluations — entirely to AI without human oversight. AI can inform; humans should decide.

A Note on AI Safety Measures

Most major AI providers have implemented safeguards designed to reduce harmful outputs — content policies, trained refusals, and regular red-teaming (deliberate testing for bias and harm). These measures have meaningfully reduced some categories of bias and harmful output.

They have not eliminated bias. The nature of the problem — statistical patterns in massive datasets — means that perfect solutions do not yet exist. Safeguards also create their own problems: over-cautious refusals that prevent legitimate use, and inconsistent application across demographic groups.

This is not a reason to avoid AI. It is a reason to use it thoughtfully, to maintain oversight where it matters, and to advocate for the kind of transparency and accountability that makes it possible to hold the people who build and deploy AI systems responsible when those systems fail.

Practice Task

Try this experiment: ask an AI image generator (such as DALL-E 3 or Midjourney) to generate "a doctor," "a nurse," "a criminal," and "a CEO" with no other instructions. Generate several images for each prompt, since one image cannot show a pattern, and look at what demographics appear across them. Then generate the same prompts with explicit demographic instructions and compare. This exercise makes statistical bias visible in a way that is difficult to ignore.