Lesson 8.3 — Privacy and AI: What Happens to Your Data


Every time you type a message into an AI tool, something happens to that text. It travels across the internet, is processed by servers in data centres, may be stored, may be reviewed by humans, and in some cases is used to train future versions of the model. Most people have not read the privacy policies of the tools they use every day. This lesson gives you what you actually need to know.

What Happens When You Type a Prompt

When you submit a message to an AI tool, the journey typically looks like this:

Your text is transmitted to the company's servers, usually encrypted in transit (this is standard and expected)
The model processes your input and generates a response
Your conversation may be stored — how long, and for what purpose, varies by provider
Human reviewers may read conversations as part of safety review and quality improvement programmes
Your data may be used to train future models — unless you opt out, where opt-out is available

Three variables differ most across platforms: how long data is stored, whether humans review it, and whether you can opt out of training data use.

Platform Data Handling Comparison

Platform	Default storage	Human review	Training data use	Opt-out available
ChatGPT (free)	Conversations stored; used for training by default	Yes, by human reviewers	Yes, by default	Yes, in Settings → Data Controls → "Improve the model for everyone"
ChatGPT (Plus)	Stored; training opt-out available	Yes, with reduced review	Opt-out available	Yes
ChatGPT (Team/Enterprise/API)	Not used for training by default	No by default	No by default	N/A (off by default)
Claude (claude.ai free)	Stored; may be used for training	Possible	May be used	Limited
Claude (API / Claude.ai Pro)	Anthropic commits to not using for training by default	Limited	No	N/A — off by default
Grok (free / grok.com)	Conversations may be used to improve xAI models	Possible	By default	Limited — check account settings
Grok (X Premium)	Same as free; X's privacy policy also applies	Possible	By default	Via X account settings
Gemini (Google, free)	Stored; reviewed by human raters	Yes	Yes, by default	Yes — in Google Account → Gemini Apps Activity
Gemini Advanced	Stored; less training use	Reduced	Limited	Available
Microsoft Copilot (consumer)	Microsoft privacy policy applies	Possible	Yes	Via privacy settings
Microsoft Copilot (enterprise)	Enterprise data commitments apply	No	No	N/A

Note: Privacy policies change. Before relying on these details for important decisions, check the current policy directly on each platform.

What You Should Never Type into Public AI Tools

Regardless of platform, certain categories of information should never be entered into consumer AI tools:

Personal data of others:

Full names + addresses + financial information of customers or clients
Medical records or health information about specific individuals
Employee personal data

Confidential business information:

Unreleased product plans, revenue figures, or strategic plans
Client names combined with sensitive contract details
Proprietary source code or trade secrets

Your own sensitive data:

Passwords, API keys, or authentication credentials
Full financial account details
Information that could identify you in combination (e.g. full name + date of birth + medical condition)

Legal matter details:

Ongoing legal disputes with identifying information
Attorney-client communications

The rule of thumb: if this text appeared in a data breach, would there be consequences? If yes, keep it out of consumer AI tools.
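
The rule of thumb can be partly automated with a quick check before you paste text into a tool. The sketch below is a minimal illustration, not a complete detector: the pattern list and the function name find_sensitive are assumptions made for this example, and real credentials and personal data take many more forms than these four patterns cover.

```python
import re

# Hypothetical patterns for illustration only; they will miss many real cases
# and will sometimes flag harmless text.
PATTERNS = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "possible API key or token": re.compile(r"\b[A-Za-z0-9_-]{32,}\b"),
    "possible card number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "password assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_sensitive(text: str) -> list[str]:
    """Return the label of every pattern that matches somewhere in the text."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(text)]


draft = "Can you debug this? password = hunter2"
warnings = find_sensitive(draft)
if warnings:
    print("Check before sending:", ", ".join(warnings))
```

An empty result does not mean the text is safe to share. A check like this only catches obvious slips, so the breach question above remains the real test.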

The Samsung Incident

In April 2023, Samsung engineers were discovered to have entered sensitive internal code and meeting minutes into ChatGPT on three separate occasions within a 20-day period. One engineer pasted source code for a semiconductor production tool to ask for optimisation help. Another uploaded meeting notes. A third asked the AI to convert meeting notes into a presentation.

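Samsung had granted employees access to ChatGPT for legitimate work purposes but had not yet established clear guidance about what could be shared. Because consumer ChatGPT conversations were stored on OpenAI's servers and could be used for training by default, the code and meeting content left Samsung's control and could not be reliably recalled.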

Samsung subsequently restricted internal use of external AI tools and began developing internal AI systems. Multiple other large companies — Apple, JPMorgan, Amazon — issued similar restrictions around the same time.

What this illustrates: The risk is not theoretical. Employees making reasonable-seeming decisions in the course of work can inadvertently share highly sensitive data. The solution is policy and education, not just technology.

How to Opt Out of Training Data Use

ChatGPT

Log into chatgpt.com (formerly chat.openai.com)
Click your profile picture → Settings
Go to Data Controls
Toggle off "Improve the model for everyone"

Once opted out, your future conversations will not be used for training. Past conversations already submitted cannot be retroactively withdrawn.

Google Gemini

Log into your Google Account
Go to myaccount.google.com
Navigate to Data & Privacy → Web & App Activity
Find Gemini Apps Activity and turn it off

Alternatively, go directly to gemini.google.com → Settings → Gemini Apps Activity.

Claude (Anthropic)

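Anthropic's consumer product has historically offered more limited opt-out options than competitors, and its consumer terms have changed over time, so check the current privacy settings in your account. Using the API (for developers) or a paid enterprise plan offers stronger data protection. For sensitive work, the API with a zero data retention arrangement in place is the appropriate choice.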

Using AI Safely for Work

Even when using compliant enterprise plans, developing good habits protects you and your organisation:

Use placeholders for identifying information: Instead of "Draft an email to John Smith at ACME Corp about their £250,000 contract renewal," write: "Draft an email to [CLIENT NAME] at [COMPANY] about their [VALUE] contract renewal."

Enter the prompt with the placeholders in place, then substitute the real details back into the response in your word processor, so they never leave your machine.
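
If you do this often, the swap can be scripted so the real details stay on your own computer. The following is a minimal sketch using the example above; the function names redact and restore and the details dictionary are hypothetical names chosen for illustration, and plain string replacement only catches details you have listed in advance.

```python
def redact(text: str, replacements: dict[str, str]) -> str:
    """Swap each real detail for its placeholder before the text is sent anywhere."""
    # Replace longer strings first so a short detail never clobbers part of a longer one.
    for real in sorted(replacements, key=len, reverse=True):
        text = text.replace(real, replacements[real])
    return text


def restore(text: str, replacements: dict[str, str]) -> str:
    """Put the real details back into the AI's response, locally."""
    for real, placeholder in replacements.items():
        text = text.replace(placeholder, real)
    return text


# Real detail -> placeholder. This mapping stays on your machine.
details = {
    "John Smith": "[CLIENT NAME]",
    "ACME Corp": "[COMPANY]",
    "£250,000": "[VALUE]",
}

prompt = "Draft an email to John Smith at ACME Corp about their £250,000 contract renewal."
safe_prompt = redact(prompt, details)
print(safe_prompt)
```

Only safe_prompt is pasted into the AI tool. Once the draft comes back, restore(response_text, details) fills the real names in again, where response_text is whatever the tool returned.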

Use enterprise or API versions for professional work: For anything involving client data, internal business information, or regulated industries, use tools with explicit enterprise data commitments — Microsoft Copilot for Microsoft 365 users, Anthropic's API, Google Workspace Gemini — rather than consumer apps.

Know your organisation's policy: Many organisations have (or are developing) AI usage policies. Know what yours says before using AI tools for work. If no policy exists, advocate for one.

Key takeaway: Consumer AI tools are not designed to handle sensitive data. Use them for personal tasks, general learning, and non-sensitive work. For anything involving personal data, client information, or business confidentiality, use appropriate enterprise tools with explicit data commitments.

Privacy and AI: The Bigger Picture

Beyond individual data handling, AI raises broader privacy questions:

Surveillance: AI-powered facial recognition and monitoring capabilities have significant privacy implications, particularly in authoritarian contexts
Inference: AI can infer sensitive information (political views, health status, sexual orientation) from data that seems non-sensitive
Data aggregation: Combining many small pieces of individually innocuous information can reveal things people did not intend to disclose
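
The aggregation point can be made concrete with a toy calculation. The sketch below uses entirely made-up records and a hypothetical helper, unique_count, to count how many people are singled out as more fields are combined. It illustrates the idea only and is not a formal privacy measure.

```python
from collections import Counter

# Entirely made-up records for illustration.
records = [
    {"postcode": "AB1", "birth_year": 1985, "job": "nurse"},
    {"postcode": "AB1", "birth_year": 1985, "job": "teacher"},
    {"postcode": "AB1", "birth_year": 1990, "job": "nurse"},
    {"postcode": "CD2", "birth_year": 1985, "job": "nurse"},
    {"postcode": "CD2", "birth_year": 1990, "job": "teacher"},
]


def unique_count(rows, fields):
    """Count rows whose combination of the given fields appears exactly once."""
    combos = Counter(tuple(row[f] for f in fields) for row in rows)
    return sum(1 for row in rows if combos[tuple(row[f] for f in fields)] == 1)


for fields in (["postcode"], ["postcode", "birth_year"], ["postcode", "birth_year", "job"]):
    print(fields, "->", unique_count(records, fields), "of", len(records), "uniquely identified")
```

In this toy data no single field identifies anyone, yet postcode and birth year together single out three of the five people, and all three fields together single out everyone.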

These larger questions are being actively debated by regulators, researchers, and policymakers worldwide. The EU's AI Act, the UK's developing AI regulation framework, and US state-level laws are all grappling with these issues in real time.

Practice Task

This week, review the privacy settings on the AI tool you use most. Find the training data opt-out, understand what it does, and decide whether to opt out. Then audit whether you have ever entered information into a consumer AI tool that you would prefer to have kept private. This audit itself is a useful exercise — most people discover something surprising.
