Lesson 8.7 — How to Verify AI Outputs

We have covered AI hallucination, bias, and the importance of critical thinking. This lesson brings it all together into a practical, actionable system. The goal is not to verify everything — that would eliminate the time savings that make AI useful. The goal is to verify the right things, at the right level of depth, proportionate to the stakes.


The Core Principle: Risk-Proportionate Verification

Not all AI output carries the same risk. A creative brainstorm that you will further develop has essentially no verification requirement. A statistical claim you are about to present to your company's board requires serious checking. Most of what you do with AI falls somewhere in between.

The question to ask before any significant use of AI output: What is the worst realistic outcome if this turns out to be wrong?

  • Low consequence (can be fixed easily, affects only you): light verification.
  • Medium consequence (affects others, reputational risk, moderate financial stakes): meaningful verification.
  • High consequence (legal, medical, financial, safety, public claims): thorough verification.
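The triage above can be sketched as a small decision function. This is only an illustration: the `Level` names, the `HIGH_STAKES` tags, and the two yes/no inputs are assumptions made for the example, not a fixed scheme, and real cases will need judgement.

```python
from enum import Enum


class Level(Enum):
    LIGHT = 1       # Level 1: low consequence
    MEANINGFUL = 2  # Level 2: medium consequence
    THOROUGH = 3    # Level 3: high consequence


# Hypothetical domain tags, taken from the high-consequence list above.
HIGH_STAKES = {"legal", "medical", "financial", "safety", "public claim"}


def triage(domains: set[str], affects_others: bool, easily_fixed: bool) -> Level:
    """Return a verification level for one piece of AI output."""
    if domains & HIGH_STAKES:
        return Level.THOROUGH
    if affects_others or not easily_fixed:
        return Level.MEANINGFUL
    return Level.LIGHT


# A private brainstorm versus a financial figure for the board.
print(triage(set(), affects_others=False, easily_fixed=True))
print(triage({"financial"}, affects_others=True, easily_fixed=False))
```

The function checks the high-stakes domains first, so a claim in one of those areas is never downgraded just because it seems easy to fix.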


Level 1: Low-Risk Output (Light Verification)

Examples: Creative writing, brainstorming, personal planning, first drafts you will heavily edit, learning new topics for your own interest.

Verification approach:

  • Read for obvious logical errors or contradictions
  • Check that the output actually addresses what you asked
  • Apply basic common sense: does this sound plausible given what you know?

Time investment: 1–3 minutes.

What you're not doing: Checking every fact, following up citations, seeking expert opinion.


Level 2: Medium-Risk Output (Meaningful Verification)

Examples: Professional communications you will send, research you will reference in a presentation, advice you will act on, information you will share with others.

Verification approach:

  1. Identify the specific factual claims. Not the general thrust of the answer, but the concrete, verifiable assertions: numbers, dates, names, cited examples, procedures.
  2. Check those claims against an independent source. Not another AI. An actual source — a news article, official documentation, a domain expert.
  3. Look for what's missing. AI responses often omit important caveats, competing viewpoints, or contextual information. Ask yourself: "Is this the whole picture?"
  4. Check for recency. An AI model's training data ends at a cutoff date, so anything time-sensitive (regulations, current events, pricing, recent research) may be out of date.

Time investment: 5–20 minutes depending on complexity.
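If you want to keep track of Level 2 checks, steps 2 and 4 can be recorded per claim. The sketch below is one possible shape, with hypothetical names (`Claim`, `Source`, `status`); the rule it encodes is the one above, that another AI's answer does not count as an independent source.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    description: str
    is_ai: bool = False  # another AI's answer is not independent confirmation


@dataclass
class Claim:
    text: str
    time_sensitive: bool = False
    checked_current: bool = False  # set to True once recency is confirmed
    sources: list[Source] = field(default_factory=list)

    def status(self) -> str:
        independent = [s for s in self.sources if not s.is_ai]
        if not independent:
            return "unverified"
        if self.time_sensitive and not self.checked_current:
            return "check recency"
        return "confirmed"


claim = Claim("Statutory notice is capped at 12 weeks", time_sensitive=True)
claim.sources.append(Source("a second chatbot's answer", is_ai=True))
print(claim.status())  # unverified
claim.sources.append(Source("official guidance page"))
print(claim.status())  # check recency
claim.checked_current = True
print(claim.status())  # confirmed
```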


Level 3: High-Risk Output (Thorough Verification)

Examples: Medical information you will act on or pass to others, legal or financial information informing a significant decision, claims you will publish, information that will affect someone's safety or welfare.

Verification approach:

  1. All steps from Level 2, plus:
  2. Trace key claims to primary sources. Not a news article summarising a study — the actual study. Not a website summarising a regulation — the actual regulation.
  3. Seek expert review. A qualified human should review outputs with significant medical, legal, financial, or technical implications.
  4. Check the AI's reasoning, not just its conclusion. Ask the AI to explain how it reached its conclusion. A well-reasoned answer with visible logic is easier to evaluate than a confident assertion.
  5. Test with alternative queries. Ask the same question differently, from a sceptical angle: "What are the arguments against this approach?" A claim that holds up under challenge is more reliable than one that doesn't.

Time investment: 30 minutes to several hours — or delegated to qualified professionals.


SIFT Adapted for AI Outputs

The SIFT framework (from Lesson 8.4) applies equally well to evaluating AI-generated text:

S: Stop. Before sharing, acting on, or embedding AI output in your work, pause. The fluency and confidence of AI writing can create an impression of authority that the content may not deserve.

I: Investigate the source of key claims. For any specific factual claim in an AI response, ask: "Where does this come from?" If the AI cites a source, go and look at it. If it doesn't cite one, search for independent confirmation.

F: Find better coverage. For any claim that matters, can you find it confirmed in multiple independent, credible sources? A claim that only appears in AI output and is not independently confirmed by authoritative sources deserves scepticism.

T: Trace claims to original context. AI models have absorbed enormous quantities of information, but context is often lost in that process. A study conducted in one country may be cited as a universal finding. A minority position may be presented as consensus. Tracing back to the original source often reveals important context the AI omitted.


Three Worked Verification Examples

Example 1: AI Claims a Statistic

AI output: "Studies show that people who exercise in the morning are 40% more likely to stick to their fitness routines than those who exercise later in the day."

Risk level: Medium — you are considering sharing this in a social media post.

Verification process:

  1. Search "morning exercise adherence 40 percent study" — does this appear in search results from credible sources?
  2. If you find a study, check: who conducted it, what was the sample size, was it peer-reviewed, how was "stick to routine" defined?
  3. Check whether the 40% figure appears in the original study, or whether the AI produced it by paraphrasing (or confabulating from) a more nuanced finding.

Likely finding: Statistics like this are frequently hallucinated or distorted paraphrases of real research with different numbers, different definitions, or important caveats. If you cannot find the original source, do not use the statistic.


Example 2: AI Summarises a Legal Rule

AI output: "Under UK employment law, employers must give a minimum of one week's notice per year of service, up to a maximum of 12 weeks."

Risk level: High — you are considering giving this information to an employee or acting on it as an employer.

Verification process:

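  1. Verify the specific numbers on GOV.UK (the government's official guidance) and, because this is high-risk, against the legislation itself (section 86 of the Employment Rights Act 1996, on legislation.gov.uk)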
  2. Check whether the rule applies straightforwardly to your situation or whether there are exceptions (contracts with longer notice periods, probationary periods, gross misconduct situations)
  3. For anything with financial or legal consequence, consult an employment solicitor or HR professional

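Actual check: The statutory minimum notice an employer must give is 1 week for an employee with between 1 month and 2 years of continuous employment, then 1 week for each complete year of employment from 2 years up to 12 years, and 12 weeks after 12 or more years. The AI's summary was approximately correct in this case, but it left out the one-month qualifying period and the flat one-week minimum during the first two years. "Approximately correct" is not reliable enough for legal matters.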
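Writing the rule out as code makes the gap between the AI's wording and the statutory bands easier to see. This is a sketch under stated assumptions: it takes complete months of continuous employment as input and assumes one week per complete year between 2 and 12 years, which is how the rule is commonly summarised and should be checked against GOV.UK. The function names are hypothetical.

```python
def statutory_min_notice_weeks(months_employed: int) -> int:
    """Minimum notice from the employer, in weeks (assumed bands, see above)."""
    years = months_employed // 12  # complete years of service
    if months_employed < 1:
        return 0                   # under one month: no statutory minimum
    if years < 2:
        return 1                   # one month up to two years: one week
    return min(years, 12)          # one week per complete year, capped at 12


def ai_summary_weeks(months_employed: int) -> int:
    """The AI's sentence read literally: one week per complete year, max 12."""
    return min(months_employed // 12, 12)


for months in (6, 18, 60, 180):
    print(months, statutory_min_notice_weeks(months), ai_summary_weeks(months))
```

For an employee with six months of service the two functions disagree (1 week versus 0), which is exactly the kind of edge case an "approximately correct" summary hides.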


Example 3: AI Provides a Historical Account

AI output: "The Great Fire of London began on 2 September 1666 in a bakery on Pudding Lane and burned for three days, destroying over 13,000 houses and 87 churches. The fire was largely responsible for the spread of the black plague ceasing in London."

Risk level: Medium — you are writing educational content.

Verification process:

  1. Check the main facts against an encyclopaedia or history source
  2. Note the claim about the plague — this is a persistent folk history claim worth specifically checking

What verification reveals: The date, location, and scale of destruction are accurate, though most accounts have the fire burning for about four days rather than three. The plague claim is contested. While the fire occurred shortly after a major plague outbreak, historians note that the plague was already declining before the fire. The fire as a "plague cure" is a popular narrative that the historical record does not cleanly support. The AI presented a contested claim as straightforward fact.


Workflow Integration

Verification works best when it is built into your workflow rather than treated as an optional extra. Here are three integration approaches:

For writing: After generating a draft, highlight all specific factual claims before doing anything else. These are your verification queue. Do not edit for style until facts are confirmed.

For research: Use AI to identify what to look for, not to be the source itself. "What are the key studies on [topic]?" is a better use of AI than "What do studies say about [topic]?" because it directs you to sources you can then read yourself. Confirm that each suggested study actually exists before relying on it, since AI can also fabricate citations.

For professional advice (medical, legal, financial): Use AI to prepare your questions and understand the context before a conversation with a qualified professional. Never use AI output as a substitute for that conversation.
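For the writing workflow above, a first pass at the verification queue can be automated. The sketch below pulls out sentences that contain digits or attribution phrases. The patterns are assumptions chosen for illustration and will miss claims that contain neither, so treat the result as a starting list rather than a complete one.

```python
import re

# Heuristic markers of checkable content (assumed, not exhaustive):
# any digit (numbers, dates, percentages) or a common attribution phrase.
CHECKABLE = re.compile(
    r"\d|\b(?:studies show|research shows|according to)\b",
    re.IGNORECASE,
)


def verification_queue(draft: str) -> list[str]:
    """Return the sentences of a draft that look like factual claims."""
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    return [s for s in sentences if CHECKABLE.search(s)]


draft = (
    "Morning workouts feel great. "
    "Studies show that morning exercisers are 40% more likely to stick "
    "to a routine. Try it and see how it goes."
)
for sentence in verification_queue(draft):
    print("CHECK:", sentence)
```

Only the middle sentence is flagged here. The sentence splitter is deliberately naive, so abbreviations such as "e.g." will split sentences in the wrong place.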

Key takeaway: The three-level risk triage (low, medium, high) is the core practical tool. Most people who get into trouble with AI output are not being careless; they just haven't paused to ask what level of verification the stakes require.


Quick Reference

Before using any significant AI output, ask:

  1. What is the worst realistic outcome if this is wrong?
  2. Does the risk level require light, meaningful, or thorough verification?
  3. Have I identified the specific factual claims (not the general argument)?
  4. Have I checked those claims against independent sources?
  5. Have I looked for what might be missing — caveats, competing views, context?

Running through the five questions takes about two minutes, and it catches many of the problems that come from using AI output uncritically. The verification it prompts then takes as long as the risk level requires.


Practice Task

For the next week, every time you use an AI output for something you will share, send, or act upon, run the five-question checklist. Note which outputs required significant correction and which were reliable as-is. Over time, this builds an accurate intuition for which types of AI tasks require more scrutiny in your own workflow.
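If you keep your notes in a simple log, a few lines of code can summarise them at the end of the week. The entries below are made-up placeholders to show the format, and `correction_rates` is a hypothetical helper; substitute your own task types and results.

```python
from collections import defaultdict

# Placeholder entries: (task type, needed significant correction?)
log = [
    ("statistics", True),
    ("email draft", False),
    ("statistics", True),
    ("summary", False),
    ("email draft", True),
]


def correction_rates(entries):
    """Share of outputs per task type that needed significant correction."""
    totals = defaultdict(int)
    corrected = defaultdict(int)
    for task_type, needed_fix in entries:
        totals[task_type] += 1
        corrected[task_type] += needed_fix  # True counts as 1, False as 0
    return {task: corrected[task] / totals[task] for task in totals}


for task, rate in sorted(correction_rates(log).items(), key=lambda kv: -kv[1]):
    print(f"{task}: {rate:.0%} needed correction")
```

Task types near the top of the printed list are the ones that deserve more scrutiny in your own workflow.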


You've finished all the lessons in Module 8. Take the quiz to test your knowledge.