Bias in AI
Lesson 8.2 — AI Bias: Where It Comes From and What You Can Do
Bias in AI systems is not a matter of AI having opinions or bad intentions. It is a structural problem: AI models learn patterns from data created by humans in a world that has always contained inequalities, assumptions, and blind spots. Those patterns get embedded in the model's outputs, often invisibly, where they can affect real people in ways that compound existing disadvantages.
Understanding where bias comes from, what it looks like in practice, and how to mitigate it as a user is not a theoretical exercise. It is a practical skill for anyone using AI in contexts that affect people.
Four Sources of Bias
Source 1: Training Data Bias
AI models learn from large datasets scraped from the internet, digitised text, and human-curated sources. Those sources reflect the world as it was when they were written — including who wrote them, who was represented in them, and what assumptions the authors held.
If historical text describes nurses as "she" and engineers as "he" far more often than the reverse, a model trained on that text learns those statistical associations. It does not know this is a bias — it has learned a pattern.
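To make "statistical association" concrete, here is a minimal sketch that counts which pronouns appear alongside an occupation word. The eight-sentence corpus and the `pronoun_counts` helper are invented for illustration. Real models learn from billions of sentences and use far richer representations than raw counts, but the skew they absorb has the same origin.

```python
from collections import Counter

# Hypothetical toy corpus, invented for this example.
corpus = [
    "the nurse said she would check the chart",
    "the engineer said he fixed the bridge",
    "the nurse said she was on call",
    "the engineer said she reviewed the design",
    "the engineer said he would be late",
    "the nurse said he needed more staff",
    "the engineer said he signed the drawings",
    "the nurse said she updated the record",
]


def pronoun_counts(sentences, occupation):
    """Count 'he' and 'she' in sentences that mention the occupation."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        if occupation in words:
            counts.update(w for w in words if w in ("he", "she"))
    return counts


nurse = pronoun_counts(corpus, "nurse")
engineer = pronoun_counts(corpus, "engineer")

# Share of "she" among gendered pronouns for each occupation.
print("nurse:", nurse["she"] / sum(nurse.values()))           # 0.75
print("engineer:", engineer["she"] / sum(engineer.values()))  # 0.25
```

Nothing in the code expresses a view about nurses or engineers. The 75/25 split comes entirely from the text it was given.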
Example: Image generation models trained predominantly on Western photography may default to generating images of "a professional" or "a CEO" as white men unless explicitly prompted otherwise. This is not because the model intends discrimination — it is reflecting the demographic distribution in its training data.
Source 2: Label Bias
Many AI systems are trained on human-labelled data: examples that people have looked at and categorised ("this is appropriate / inappropriate," "this face looks trustworthy / untrustworthy," "this resume is strong / weak"). Those human labellers bring their own biases, conscious or not, and those biases get baked into the training signal.
Source 3: Feedback and Reinforcement Bias
Models refined through reinforcement learning from human feedback (RLHF), a technique widely used to make AI systems more helpful and safer, are shaped by the preferences of the people providing that feedback. If those people skew in any direction, whether demographically, culturally, or philosophically, the model's outputs will skew the same way.
Source 4: Historical Bias in Outcomes
When AI systems predict outcomes based on historical data, they can perpetuate historical injustice. If past lending decisions, hiring decisions, or sentencing decisions were biased, a model trained on those outcomes will learn to replicate those patterns.
Key takeaway: Bias in AI is not primarily a technical problem. It is a data problem, a human problem, and a structural problem. Better algorithms alone cannot fix it.
Types of Bias: A Reference Table
| Bias type | Description | Example |
|---|---|---|
| Representation bias | Some groups are underrepresented in training data | Medical AI trained mostly on data from white patients performs worse for others |
| Measurement bias | Proxies used for a quality introduce inequity | Using ZIP code as a proxy for creditworthiness penalises redlined areas |
| Aggregation bias | One model is used for groups with different characteristics | A single diabetes risk model applied across racial groups with different risk profiles |
| Evaluation bias | Benchmark datasets don't represent the deployment population | A facial recognition system benchmarked on light-skinned faces underperforms on dark-skinned ones |
| Historical bias | Data reflects past discrimination | Hiring AI trained on historical hire data perpetuates past biases |
| Linguistic bias | Models perform better on dominant languages and dialects | Sentiment analysis works poorly on African-American Vernacular English |
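Several rows in this table (representation, evaluation, and linguistic bias) share a symptom: a single headline accuracy figure hides a gap between groups. The sketch below shows the idea of disaggregated evaluation. The records, group names, and `accuracy_by_group` helper are hypothetical, not results from any real system.

```python
from collections import defaultdict

# Hypothetical evaluation records: (group, true_label, predicted_label).
# group_b is deliberately underrepresented, as in representation bias.
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 1),
    ("group_b", 1, 0), ("group_b", 0, 0),
]


def accuracy_by_group(rows):
    """Return {group: fraction of predictions that match the true label}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in rows:
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


overall = sum(t == p for _, t, p in records) / len(records)
per_group = accuracy_by_group(records)

print("overall:", overall)      # 0.8
print("per group:", per_group)  # {'group_a': 0.875, 'group_b': 0.5}
```

The overall figure of 0.8 looks acceptable, yet the model is no better than a coin flip for the smaller group. With only two records for `group_b`, even that estimate is unreliable, which is itself part of the problem.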
Two Documented Real-World Cases
COMPAS Recidivism Algorithm
COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) was widely used in the US criminal justice system to predict an individual's likelihood of reoffending — informing decisions about bail, sentencing, and parole.
In 2016, ProPublica published an analysis showing that Black defendants who did not go on to reoffend were nearly twice as likely as white defendants to have been labelled high risk. Conversely, white defendants who did reoffend were more likely to have been labelled low risk. The algorithm did not use race as an input variable, but it used inputs correlated with race (neighbourhood, prior contact with law enforcement) that acted as proxies and carried the same discriminatory effect.
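The disparity described here is a difference in error rates between groups. The sketch below computes the two rates involved: the false positive rate (labelled high risk but did not reoffend) and the false negative rate (labelled low risk but did reoffend). The counts are invented for illustration and are not ProPublica's figures. `make_records`, `error_rates`, and the group names are hypothetical.

```python
def make_records(fp, tn, fn, tp):
    """Build (labelled_high_risk, reoffended) pairs from confusion-matrix counts."""
    return (
        [(True, False)] * fp      # false positives
        + [(False, False)] * tn   # true negatives
        + [(False, True)] * fn    # false negatives
        + [(True, True)] * tp     # true positives
    )


def error_rates(records):
    """Return false positive and false negative rates for one group.

    Assumes the group contains at least one person who reoffended and
    at least one who did not; otherwise a rate is undefined.
    """
    fp = sum(1 for high, reoff in records if high and not reoff)
    tn = sum(1 for high, reoff in records if not high and not reoff)
    fn = sum(1 for high, reoff in records if not high and reoff)
    tp = sum(1 for high, reoff in records if high and reoff)
    return {
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Invented counts for two hypothetical groups.
group_x = error_rates(make_records(fp=4, tn=6, fn=3, tp=7))
group_y = error_rates(make_records(fp=2, tn=8, fn=5, tp=5))

print(group_x)  # {'false_positive_rate': 0.4, 'false_negative_rate': 0.3}
print(group_y)  # {'false_positive_rate': 0.2, 'false_negative_rate': 0.5}
```

In this toy data both groups receive the same number of correct labels (13 of 20), yet one group bears twice the false positive rate of the other. Equal overall accuracy does not imply equal treatment.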
The COMPAS case illustrates a critical point: removing a protected characteristic from a model does not remove bias if proxies for that characteristic remain.
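A toy sketch of the proxy effect, under heavily simplified assumptions: the applicants, the two neighbourhoods, and the majority-vote "model" are all invented and stand in for a real system only loosely. The rule is fitted without ever reading the group column, yet its decisions still split along group lines because neighbourhood is correlated with group.

```python
# Hypothetical applicants: (group, neighbourhood, historically_approved).
# Group A mostly lives in "north", group B mostly in "south".
applicants = (
    [("A", "north", True)] * 8 + [("A", "south", False)] * 2
    + [("B", "north", True)] * 2 + [("B", "south", False)] * 8
)


def fit_rule(rows):
    """Fit a group-blind rule: approve a neighbourhood if most of its
    past applicants were approved. The group column is ignored."""
    approved, total = {}, {}
    for _group, hood, outcome in rows:
        total[hood] = total.get(hood, 0) + 1
        approved[hood] = approved.get(hood, 0) + int(outcome)
    return {hood: approved[hood] / total[hood] > 0.5 for hood in total}


def approval_rate(rows, rule, group):
    """Fraction of a group's members the rule would approve."""
    members = [row for row in rows if row[0] == group]
    return sum(rule[hood] for _g, hood, _o in members) / len(members)


rule = fit_rule(applicants)

print(rule)                                  # {'north': True, 'south': False}
print(approval_rate(applicants, rule, "A"))  # 0.8
print(approval_rate(applicants, rule, "B"))  # 0.2
```

Dropping the protected attribute changed nothing about the outcome. Detecting this kind of effect requires keeping the attribute available for auditing, even when it is excluded from the model's inputs.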
Amazon Hiring Tool
Amazon developed an AI recruiting tool to review CVs and score candidates for technical roles. The system was trained on historical CVs submitted to Amazon over a 10-year period — predominantly from men, reflecting the demographics of the tech industry.
The model learned to penalise CVs that included the word "women's" (as in "women's chess club captain") and downgraded graduates of two all-women's colleges. Reuters reported in 2018 that Amazon had scrapped the tool after discovering these issues.
This case shows how historical data, reflecting existing industry demographics, can actively reproduce and reinforce those demographics in a system designed to be objective.
What Users Can Do: Practical Mitigation Strategies
Strategy 1: Audit your prompts for assumptions
Your prompts encode your assumptions. Without explicit instruction, "Write a story about a successful entrepreneur" will likely default to whichever demographic profile dominates the model's training data. Ask yourself: am I specifying something neutral, or am I assuming a default?
Better prompts:
- "Write a story about a successful entrepreneur — specifically a Latina woman in her 50s"
- "Describe a nurse — vary the age, background, and gender across three different descriptions"
Strategy 2: Request diverse perspectives explicitly
When generating content about people, topics that affect different groups, or analyses of human behaviour, explicitly ask for diverse representation:
"In your response, ensure the examples represent a diverse range of ages, genders, ethnicities, and backgrounds."
Strategy 3: Test for differential performance
If you are using AI in a professional context that affects people (hiring, lending, healthcare, education), test whether it performs equally across different groups. Run the same scenario with different demographic details and compare outputs.
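One way to structure such a test is to hold the scenario fixed, vary only the demographic details, and compare the outputs. The sketch below is a minimal harness for that. The template, the attribute values, and `fake_model` are all hypothetical. `fake_model` is a deliberately biased stand-in so the example runs offline, and you would replace it with a call to whichever AI system you are testing.

```python
from itertools import product

# Hypothetical scenario; only the placeholders change between runs.
TEMPLATE = (
    "Candidate {name}, {age} years old, has 6 years of Python experience. "
    "Rate suitability from 1 to 10."
)

# Hypothetical attribute values to vary.
VARIANTS = {
    "name": ["Emily", "Jamal", "Mei", "Carlos"],
    "age": [28, 55],
}


def build_prompts(template, variants):
    """Return (attributes, prompt) pairs for every combination of values."""
    keys = list(variants)
    prompts = []
    for combo in product(*(variants[k] for k in keys)):
        attrs = dict(zip(keys, combo))
        prompts.append((attrs, template.format(**attrs)))
    return prompts


def fake_model(prompt):
    """Stand-in for a real AI call. Deliberately penalises age 55
    so the harness has something to detect."""
    return 5 if "55 years old" in prompt else 8


def score_gap(prompts, model):
    """Largest difference in score across otherwise identical scenarios."""
    scores = [model(prompt) for _attrs, prompt in prompts]
    return max(scores) - min(scores)


prompts = build_prompts(TEMPLATE, VARIANTS)
gap = score_gap(prompts, fake_model)

print(len(prompts), "prompts, score gap:", gap)  # 8 prompts, score gap: 3
```

Because the qualifications are identical in every prompt, any non-zero gap is driven by the demographic details alone. Real models are not deterministic, so in practice you would run each prompt several times and compare averages rather than single scores.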
Strategy 4: Treat AI outputs on social topics with extra scepticism
For topics involving race, gender, religion, disability, or socioeconomic status, apply more critical scrutiny than you would to neutral topics. Ask: whose perspective does this reflect? What is it not saying?
Strategy 5: Maintain human review for high-stakes decisions
Never delegate high-stakes decisions that affect people's lives — hiring, lending, health assessments, performance evaluations — entirely to AI without human oversight. AI can inform; humans should decide.
A Note on AI Safety Measures
Most major AI providers have implemented safeguards designed to reduce harmful outputs — content policies, trained refusals, and regular red-teaming (deliberate testing for bias and harm). These measures have meaningfully reduced some categories of bias and harmful output.
They have not eliminated bias. The nature of the problem — statistical patterns in massive datasets — means that perfect solutions do not yet exist. Safeguards also create their own problems: over-cautious refusals that prevent legitimate use, and inconsistent application across demographic groups.
This is not a reason to avoid AI. It is a reason to use it thoughtfully, to maintain oversight where it matters, and to advocate for the kind of transparency and accountability that makes it possible to hold AI systems responsible when they fail.
Practice Task
Try this experiment: ask an AI image generator (for example, DALL-E 3 or Midjourney) to generate "a doctor," "a nurse," "a criminal," and "a CEO" with no other instructions. Generate several images per prompt and note which demographics appear. Then run the same prompts with explicit demographic instructions and compare. This exercise makes statistical bias visible in a way that is difficult to ignore.