Lesson 8.3 — Privacy and AI: What Happens to Your Data
Every time you type a message into an AI tool, something happens to that text. It travels across the internet, is processed by servers in data centres, may be stored, may be reviewed by humans, and in some cases is used to train future versions of the model. Most people have not read the privacy policies of the tools they use every day. This lesson gives you what you actually need to know.
What Happens When You Type a Prompt
When you submit a message to an AI tool, the journey typically looks like this:
- Your text is transmitted to the company's servers, usually encrypted in transit (this is standard and expected)
- The model processes your input and generates a response
- Your conversation may be stored — how long, and for what purpose, varies by provider
- Human reviewers may read conversations as part of safety review and quality improvement programmes
- Your data may be used to train future models — unless you opt out, where opt-out is available
The key variables across platforms are: how long data is stored, whether humans review it, and whether you can opt out of training data use.
Platform Data Handling Comparison
| Platform | Default storage | Human review | Training data use | Opt-out available |
|---|---|---|---|---|
| ChatGPT (free) | Conversations stored; used for training by default | Yes, by human reviewers | Yes, by default | Yes, in Settings → Data Controls → "Improve the model for everyone" |
| ChatGPT (Plus) | Stored; training opt-out available | Yes, with reduced review | Yes, by default | Yes, same setting as free |
| ChatGPT (Team/Enterprise/API) | Not used for training by default | No by default | No by default | N/A (off by default) |
| Claude (claude.ai Free/Pro) | Stored; may be used for training, depending on your privacy settings | Possible | May be used | Check account privacy settings |
| Claude (API / enterprise plans) | Anthropic commits to not using for training by default | Limited | No | N/A (off by default) |
| Grok (free / grok.com) | Conversations may be used to improve xAI models | Possible | By default | Limited — check account settings |
| Grok (X Premium) | Same as free; X's privacy policy also applies | Possible | By default | Via X account settings |
| Gemini (Google, free) | Stored; reviewed by human raters | Yes | Yes, by default | Yes — in Google Account → Gemini Apps Activity |
| Gemini Advanced | Stored; less training use | Reduced | Limited | Available |
| Microsoft Copilot (consumer) | Microsoft privacy policy applies | Possible | Yes | Via privacy settings |
| Microsoft Copilot (enterprise) | Enterprise data commitments apply | No | No | N/A |
Note: Privacy policies change. Before relying on these details for important decisions, check the current policy directly on each platform.
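One way to keep track of the three key variables is to record them as structured data and query them. The sketch below is only an illustration: the `DataHandling` class and `needs_opt_out` function are hypothetical names, and the three entries are copied from rows of the table above, so they carry the same caveat about policies changing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataHandling:
    platform: str
    human_review: bool
    trains_by_default: bool
    opt_out_available: bool


# Three rows copied from the table above; re-check them against current policies.
PLATFORMS = [
    DataHandling("ChatGPT (free)", True, True, True),
    DataHandling("Gemini (Google, free)", True, True, True),
    DataHandling("Microsoft Copilot (enterprise)", False, False, False),
]


def needs_opt_out(entries):
    """Platforms where training is on by default but can be switched off."""
    return [e.platform for e in entries
            if e.trains_by_default and e.opt_out_available]


print(needs_opt_out(PLATFORMS))
```

The result is the list of tools where you have to act yourself if you do not want your conversations used for training.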
What You Should Never Type into Public AI Tools
Regardless of platform, certain categories of information should never be entered into consumer AI tools:
Personal data of others:
- Full names + addresses + financial information of customers or clients
- Medical records or health information about specific individuals
- Employee personal data
Confidential business information:
- Unreleased product plans, revenue figures, or strategic plans
- Client names combined with sensitive contract details
- Proprietary source code or trade secrets
Your own sensitive data:
- Passwords, API keys, or authentication credentials
- Full financial account details
- Information that could identify you in combination (e.g. full name + date of birth + medical condition)
Legal matter details:
- Ongoing legal disputes with identifying information
- Attorney-client communications
The rule of thumb: if this text appeared in a data breach, would there be consequences? If yes, keep it out of consumer AI tools.
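The rule of thumb can be partly automated with a quick check before a prompt is sent. The following is a minimal sketch, not a complete safeguard: the pattern names and regular expressions are assumptions chosen for illustration, and they will miss many kinds of sensitive data (names, medical details, contract terms) that only a human can recognise.

```python
import re

# Illustrative patterns only; a real deployment would need a vetted,
# much broader rule set (these regexes are assumptions, not a standard).
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "possible API key": re.compile(r"\b(?:sk|pk|api|key)[-_][A-Za-z0-9_-]{16,}\b"),
    "possible card number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "password assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_sensitive(prompt: str) -> list[str]:
    """Return the labels of every pattern that matches the prompt."""
    return [label for label, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(prompt)]


if __name__ == "__main__":
    draft = "Please debug this: password = hunter2, contact jo@example.com"
    hits = find_sensitive(draft)
    if hits:
        print("Stop and review before sending. Found:", ", ".join(hits))
    else:
        print("No obvious sensitive patterns found (this is not a guarantee).")
```

A clean result only means none of these particular patterns matched, so the breach question above still has to be asked by the person sending the prompt.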
The Samsung Incident
In April 2023, Samsung engineers were reported to have entered sensitive internal code and meeting minutes into ChatGPT on three separate occasions within a 20-day period. One engineer pasted source code for a semiconductor production tool to ask for optimisation help. Another uploaded meeting notes. A third asked the AI to convert meeting notes into a presentation.
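Samsung had granted employees access to ChatGPT for legitimate work purposes but had not yet established clear guidance about what could be shared. Once submitted, the code and meeting content sat on OpenAI's servers, outside Samsung's control, and under the default settings at the time could have been used as training data. Whether it actually was has not been publicly confirmed.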
Samsung subsequently restricted internal use of external AI tools and began developing internal AI systems. Multiple other large companies — Apple, JPMorgan, Amazon — issued similar restrictions around the same time.
What this illustrates: The risk is not theoretical. Employees making reasonable-seeming decisions in the course of work can inadvertently share highly sensitive data. The solution is policy and education, not just technology.
How to Opt Out of Training Data Use
ChatGPT
- Log in to chatgpt.com (formerly chat.openai.com)
- Click your profile picture → Settings
- Go to Data Controls
- Toggle off "Improve the model for everyone"
Once you have opted out, your future conversations will not be used for training. The opt-out is not retroactive: conversations already used for training cannot be withdrawn.
Google Gemini
- Sign in to your Google Account at myaccount.google.com
- Navigate to Data & Privacy → Web & App Activity
- Find Gemini Apps Activity and turn it off
Or directly at gemini.google.com → Settings → Gemini Apps Activity
Claude (Anthropic)
Anthropic's handling of consumer data has changed over time, so check the privacy settings in your claude.ai account for the current training-data control. The API (for developers) and paid enterprise plans offer stronger data protection, with data not used for training by default. For sensitive work, the API with zero data retention is the appropriate choice.
Using AI Safely for Work
Even when using compliant enterprise plans, developing good habits protects you and your organisation:
Use placeholders for identifying information: Instead of "Draft an email to John Smith at ACME Corp about their £250,000 contract renewal," write: "Draft an email to [CLIENT NAME] at [COMPANY] about their [VALUE] contract renewal."
Enter the prompt with the placeholders in place, then substitute the real details back into the output in your word processor, where they never leave your machine.
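The placeholder technique can also be scripted so the substitution happens the same way in both directions. This is a minimal sketch using the example from this lesson; the `redact` and `restore` helpers are hypothetical names, the reply text is invented for illustration, and the mapping of placeholders to real values must stay on your own machine.

```python
def redact(text: str, secrets: dict[str, str]) -> str:
    """Replace each real value with its placeholder before sending a prompt."""
    # Longest values first, so a short value never clobbers part of a longer one.
    for placeholder, real in sorted(secrets.items(), key=lambda kv: -len(kv[1])):
        text = text.replace(real, placeholder)
    return text


def restore(text: str, secrets: dict[str, str]) -> str:
    """Put the real values back into the model's reply, locally."""
    for placeholder, real in secrets.items():
        text = text.replace(placeholder, real)
    return text


secrets = {
    "[CLIENT NAME]": "John Smith",
    "[COMPANY]": "ACME Corp",
    "[VALUE]": "£250,000",
}

prompt = ("Draft an email to John Smith at ACME Corp "
          "about their £250,000 contract renewal.")
safe_prompt = redact(prompt, secrets)
print(safe_prompt)

# Pretend this came back from the AI tool (hypothetical reply).
reply = "Dear [CLIENT NAME], I am writing about the [VALUE] renewal for [COMPANY]."
print(restore(reply, secrets))
```

Exact string replacement only catches values you have listed, spelled exactly as listed, so the redacted prompt is still worth a quick read before it is sent.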
Use enterprise or API versions for professional work: For anything involving client data, internal business information, or regulated industries, use tools with explicit enterprise data commitments (such as Microsoft 365 Copilot, Anthropic's API, or Gemini for Google Workspace) rather than consumer apps.
Know your organisation's policy: Many organisations have (or are developing) AI usage policies. Know what yours says before using AI tools for work. If no policy exists, advocate for one.
Key takeaway: Consumer AI tools are not designed to handle sensitive data. Use them for personal tasks, general learning, and non-sensitive work. For anything involving personal data, client information, or business confidentiality, use appropriate enterprise tools with explicit data commitments.
Privacy and AI: The Bigger Picture
Beyond individual data handling, AI raises broader privacy questions:
- Surveillance: AI-powered facial recognition and monitoring capabilities have significant privacy implications, particularly in authoritarian contexts
- Inference: AI can infer sensitive information (political views, health status, sexual orientation) from data that seems non-sensitive
- Data aggregation: Combining many small pieces of individually innocuous information can reveal things people did not intend to disclose
These larger questions are being actively debated by regulators, researchers, and policymakers worldwide. The EU's AI Act, the UK's developing AI regulation framework, and US state-level laws are all attempts to address them, and all are still evolving.
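The data aggregation point is easier to see with numbers. The sketch below uses a tiny made-up dataset (no real people) and a hypothetical `matches` helper to count how many records remain consistent with what an observer knows as each individually harmless detail is added.

```python
# Made-up records for illustration; no real people or data.
people = [
    {"postcode": "AB1", "birth_year": 1985, "job": "nurse"},
    {"postcode": "AB1", "birth_year": 1985, "job": "teacher"},
    {"postcode": "AB1", "birth_year": 1990, "job": "nurse"},
    {"postcode": "CD2", "birth_year": 1985, "job": "nurse"},
    {"postcode": "CD2", "birth_year": 1990, "job": "teacher"},
    {"postcode": "CD2", "birth_year": 1990, "job": "nurse"},
]


def matches(records, **known):
    """Count the records consistent with everything an observer knows."""
    return sum(all(r[k] == v for k, v in known.items()) for r in records)


print(matches(people, postcode="AB1"))                                  # 3
print(matches(people, postcode="AB1", birth_year=1985))                 # 2
print(matches(people, postcode="AB1", birth_year=1985, job="teacher"))  # 1
```

No single field identifies anyone here, but three fields together narrow the set to one record. The same effect applies to details scattered across many separate prompts.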
Practice Task
This week, review the privacy settings on the AI tool you use most. Find the training data opt-out, understand what it does, and decide whether to opt out. Then audit whether you have ever entered information into a consumer AI tool that you would prefer to have kept private. This audit itself is a useful exercise — most people discover something surprising.