Every day, millions of people paste emails, code snippets, internal documents, and personal information into ChatGPT, Gemini, and Claude without thinking twice.
Most of them don’t realize that this data can be stored, logged, and potentially used for model training.
In March 2023, Samsung engineers accidentally leaked proprietary source code through ChatGPT — three separate incidents in under a month. Samsung ended up banning ChatGPT company-wide. They weren’t the last. Italy temporarily banned ChatGPT over data privacy concerns. Apple, Amazon, JPMorgan, and dozens of other companies followed with their own internal restrictions.
The reality is simple: if you’re pasting sensitive data into an AI chatbot, you’re trusting that provider with your data. And most of us are doing it without even noticing.
Here’s how to actually protect yourself.
1. Understand What Gets Stored
When you send a message to ChatGPT, that data is transmitted to OpenAI’s servers. By default:
• OpenAI may use your conversations to improve their models
• Your data is stored on their servers
• Employees may review conversations for safety and quality
You can opt out of training by going to Settings → Data Controls → Improve the model for everyone and toggling it off. But your data is still being transmitted and stored — you’re just opting out of it being used for training specifically.
Gemini and Claude have similar policies with slightly different defaults. The point is the same: once you hit send, that data is no longer just yours.
2. Know What Counts as Sensitive Data
Most people think of “sensitive data” as passwords or credit card numbers. But in practice, the things that leak into AI chatbots are much more mundane — and much more common:
• Names of clients, coworkers, or patients
• Email addresses and phone numbers
• Internal URLs and IP addresses
• API keys and tokens buried in code snippets
• Physical addresses in documents you’re summarizing
• Financial data in spreadsheets you’re asking the AI to analyze
• Medical or legal information in documents you’re drafting
You’d be surprised how much of this ends up in a prompt without you noticing.
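To see how easily this slips through, consider that the most common patterns can be caught with a few regular expressions. This is a minimal, illustrative sketch — the patterns and the sample prompt are made up for this example, and a real detector would be far more thorough:

```python
import re

# Illustrative patterns only; production detectors handle many more cases.
PATTERNS = {
    "email":   re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone":   re.compile(r"\+?\d[\d\s().-]{8,}\d"),
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key IDs start with AKIA
}

def scan(prompt: str) -> dict[str, list[str]]:
    """Return every match of each pattern found in the prompt."""
    return {name: rx.findall(prompt) for name, rx in PATTERNS.items()}

prompt = (
    "Hey, can you rewrite this email to jane.doe@acme-corp.com? "
    "Our key AKIAIOSFODNN7EXAMPLE is in the config, "
    "and her cell is +1 415-555-0142."
)
for kind, hits in scan(prompt).items():
    if hits:
        print(kind, hits)
```

A single innocuous-looking "fix this email for me" prompt trips all three patterns.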
3. Use ChatGPT’s Built-In Privacy Settings
ChatGPT offers a few settings that help, but most people never touch them.
Disable chat history:
Settings → Data Controls → Chat history & training → Off
This prevents your conversations from being used for training and stops them from appearing in your sidebar. Note: OpenAI still retains the data for 30 days for abuse monitoring.
Use Temporary Chats:
Click your profile → Temporary Chat. These conversations aren’t saved to your history and aren’t used for training.
Use the API instead of the web interface:
If you’re a developer, using the API gives you more control. API data is not used for training by default, and retention policies are stricter.
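For context, an API call is just an authenticated HTTPS request you control end to end. Here is a sketch of what that request looks like, built but not sent — the endpoint and auth header follow OpenAI's public API conventions, the model name is just an example, and actually sending it requires a real key:

```python
import json
import urllib.request

def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a Chat Completions request."""
    payload = {
        "model": "gpt-4o-mini",  # example model name
        "messages": [{"role": "user", "content": prompt}],
    }
    # Passing `data` makes this a POST request.
    return urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("sk-YOUR-KEY", "Summarize this document.")
# urllib.request.urlopen(req) would actually send it.
```

Because you assemble the payload yourself, this is also the natural place to sanitize the prompt before it ever goes over the wire.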
These settings help, but they all rely on the same assumption: you trust OpenAI with your data after it leaves your browser.
4. Manually Review Before You Send
The simplest approach: read your prompt before you hit send.
Ask yourself:
• Are there any real names in this?
• Are there email addresses, phone numbers, or physical addresses?
• Is there anything in this code snippet that’s proprietary — API keys, internal URLs, database credentials?
• If this prompt were made public, would it be a problem?
This works, but it’s tedious and error-prone. When you’re moving fast — which is the whole point of using AI — you skip this step. That’s when leaks happen.
5. Redact Sensitive Data Before It Reaches the AI
The most effective approach is to catch sensitive data before it ever leaves your browser.
This is what tools like Prompt Armour do. It’s a browser extension that sits inside ChatGPT, Gemini, and Claude and scans your input in real time as you type or paste. When it detects sensitive data — names, emails, API keys, phone numbers, addresses — it highlights them and replaces them with anonymous tokens before the message is sent.
For example:
• “John Smith” → [NAME_1]
• “john@company.com” → [EMAIL_1]
• “AKIAIOSFODNN7EXAMPLE” → [AWS_KEY_1]
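The core of this kind of reversible tokenization can be sketched in a few lines. This is not Prompt Armour’s actual implementation — the patterns and token names here are illustrative:

```python
import re

# Illustrative rules; a real tool layers NLP and entropy checks on top.
RULES = [
    ("EMAIL",   re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    ("AWS_KEY", re.compile(r"AKIA[0-9A-Z]{16}")),
]

def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace matches with numbered tokens; return the text and the mapping."""
    mapping: dict[str, str] = {}
    for label, rx in RULES:
        counter = 0
        def sub(m: re.Match) -> str:
            nonlocal counter
            counter += 1
            token = f"[{label}_{counter}]"
            mapping[token] = m.group(0)  # remember the original value
            return token
        text = rx.sub(sub, text)
    return text, mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Undo the redaction using the stored mapping."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

safe, mapping = redact("Email john@company.com, key AKIAIOSFODNN7EXAMPLE")
print(safe)  # Email [EMAIL_1], key [AWS_KEY_1]
```

Because the mapping stays local, `restore(safe, mapping)` round-trips back to the original — which is what makes the redaction reversible rather than destructive.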
The detection runs entirely in your browser using pattern matching, entropy analysis, and lightweight NLP. No data is sent to any server. The replacements are reversible — you can see exactly what was redacted and undo it if needed.
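Entropy analysis deserves a quick illustration: API keys and tokens are essentially random strings, so their Shannon entropy per character is much higher than that of ordinary words. A rough sketch of the idea — the length cutoff and threshold here are assumptions for illustration, not anyone’s published spec:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Estimated bits of entropy per character, from character frequencies."""
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in Counter(s).values())

def looks_like_secret(token: str, threshold: float = 3.5) -> bool:
    # Long strings with near-random character distributions are flagged
    # as likely credentials. Threshold is illustrative, not tuned.
    return len(token) >= 16 and shannon_entropy(token) > threshold

print(round(shannon_entropy("mississippi"), 2))      # 1.82 — ordinary word
print(looks_like_secret("sk-9fQ2xT7vLpW4mZ8aRbN1"))  # True — key-like string
```

This is why entropy checks catch secrets that no fixed regex anticipates: they flag randomness itself, not a specific format.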
The key difference from manual review: it catches things you’d miss, and it works automatically every time.
(Full disclosure: I built Prompt Armour, so I’m biased. But I built it because I kept leaking my own data into ChatGPT and got tired of it.)
6. Use Enterprise Versions When Available
If you’re using AI chatbots for work, push for the enterprise tiers:
• ChatGPT Enterprise / Team — data is not used for training, SOC 2 compliant, admin controls
• Claude for Business — similar enterprise data protections
• Gemini for Workspace — integrates with Google’s enterprise data policies
These are more expensive but give you contractual data protections that the free tiers don’t.
7. Establish a Personal (or Team) Policy
Even if you’re just one person, having a simple rule helps:
Never paste raw data. Always sanitize first.
If you’re on a team, formalize it:
• Define what counts as sensitive for your organization
• Require redaction before any AI chatbot usage
• Use tools that enforce this automatically
• Log or audit AI usage if you’re in a regulated industry
The companies that got burned — Samsung, etc. — didn’t have policies in place until after the leak. Don’t wait for your incident.
The Bottom Line
AI chatbots are incredibly useful. The answer isn’t to stop using them — it’s to use them without giving away data you can’t take back.
The combination that works best:
1. Turn off training data sharing in your settings
2. Use a redaction tool that catches sensitive data automatically
3. Do a quick manual check on anything high-stakes
4. Push for enterprise tiers if you’re using AI at work
The best time to think about AI privacy was before your first prompt. The second best time is now.