10 Things Not to Share With Your AI Chatbot

If you take nothing else from this piece, take this: while these platforms are powerful tools for students and enterprise teams alike, knowing exactly what not to share with your AI chatbot is the single most overlooked data-security gap in modern organizations. Keep personal identifiers, credentials, source code, legal drafts, medical records, and unreleased strategy completely out of public chatbots. OpenAI’s Privacy Policy states that user content may be used to improve models by default unless users explicitly opt out in their account settings.

During a recent data security audit for a regional logistics firm, we discovered marketing staff pasting raw customer shipping manifests into ChatGPT to reformat email templates. No malice — just convenience. We anticipate that most organizations harbor similar blind spots because staff assume cloud AI operates like a closed, private notebook. It does not.

What Not to Share With Your AI Chatbot

1. Personally Identifiable Information (PII)

Never paste real names, national ID numbers, phone numbers, or full addresses into public chatbots. This especially includes government identifiers like a Social Security number, which can lead to immediate identity theft if exposed.

OpenAI’s official Privacy Policy confirms that conversations may be retained and reviewed even when users opt out of training data collection. A 2025 independent fact-check verified that deleted or temporary chats are removed only after a defined retention window — not instantly. When I red-teamed an internal helpdesk, I reconstructed partial customer identities from “anonymized” prompts that kept a first name, city, and problem description. That combination was enough to match records in their CRM.
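
To make that linkage risk concrete, here is a minimal sketch, with entirely made-up records and field names, of how few quasi-identifiers it takes to pin an “anonymized” prompt back to a unique CRM row:

```python
# Made-up CRM rows; the field names are illustrative, not from any real system.
crm = [
    {"first_name": "Dana", "city": "Leeds", "last_ticket": "billing"},
    {"first_name": "Dana", "city": "York",  "last_ticket": "shipping"},
    {"first_name": "Omar", "city": "Leeds", "last_ticket": "billing"},
]

# Quasi-identifiers recovered from an "anonymized" prompt.
leaked = {"first_name": "Dana", "city": "Leeds", "topic": "billing"}

# Three weak signals combine into a unique match.
matches = [row for row in crm
           if row["first_name"] == leaked["first_name"]
           and row["city"] == leaked["city"]
           and row["last_ticket"] == leaked["topic"]]

print(f"{len(matches)} matching record(s)")  # -> 1 matching record(s)
```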

2. Proprietary Source Code and Algorithms

Developers love pasting code snippets into chatbots for debugging help. That habit is a quiet IP leak.

If your code embodies a competitive advantage, treating a public chatbot as your primary debugger means exposing your core logic to a third-party data pipeline. When I ran a code-leak drill at a fintech client, engineers admitted they routinely pasted “just one tricky function.” That function included customer-scoring logic and live API endpoints. We anticipate that most engineering teams have at least one active shadow workflow feeding proprietary code into external AI systems.

3. API Keys, Secrets, and Credentials

Never share live API keys, cloud credentials, SSH keys, or database passwords with a chatbot — no exceptions.

Security reviews confirm that conversation logs can be accessed by provider staff during abuse investigations or incident response. If a credential lands in those logs, it becomes a shared secret across systems you do not control, and an active API key or session token in a log bypasses your two-factor authentication defenses entirely. I once asked a dev team to grep their Git history for the prefix “sk-” (an OpenAI key prefix); they found dozens of historic leaks. People who leak secrets into version control will paste them into prompts too, unless hard guardrails are in place.
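
A cheap first line of defense is a pattern check before any prompt leaves the network. The sketch below is minimal: the “sk-” prefix comes from the anecdote above, while the AWS and PEM patterns are common conventions included as illustrative assumptions, not an exhaustive ruleset.

```python
import re

# Credential-shaped patterns to block. "sk-" is the OpenAI key prefix noted
# above; the AWS and PEM patterns are illustrative, not exhaustive.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),             # OpenAI-style API keys
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                # AWS access key IDs
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private keys
]

def contains_secret(text: str) -> bool:
    """Return True if any credential-shaped pattern appears in the text."""
    return any(p.search(text) for p in SECRET_PATTERNS)

prompt = "Why does this fail? client = OpenAI(api_key='sk-XXXXXXXXXXXXXXXXXXXXXXXX')"
if contains_secret(prompt):
    print("Blocked: prompt appears to contain a live credential")
```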

4. Protected Health Information (PHI)

Healthcare teams experimenting with AI for drafting clinical notes face a serious regulatory risk.

Generic consumer chat interfaces are not HIPAA-compliant environments, and pasting named patient histories into a public chatbot almost certainly moves PHI outside your formal compliance boundary. In one clinic security workshop I ran, a nurse admitted, “I just paste the text and ask it to make it sound kinder.” That text included medication histories and full patient names. We anticipate that enforcement agencies will sharpen their focus on AI-related PHI leaks over the next two years.

5. Financial Account Details

Bank account numbers, IBANs, card PANs, and internal financial projections do not belong in chat prompts.

Policy breakdowns show that even users who opt out of training still have prompts stored in provider logs for safety-review windows, which means raw financial identifiers remain retrievable for a period after each session. Attackers treat centralized high-value datasets as priority targets, and AI prompt logs are increasingly appearing in breach disclosures alongside traditional application databases.

6. Unreleased Corporate Strategy and Financials

Do not paste draft earnings reports, M&A decks, product roadmaps, or board minutes into public chatbots.

The NIST AI Risk Management Framework, published by the U.S. National Institute of Standards and Technology, explicitly frames AI systems as risk-bearing components requiring the same governance discipline as any other information system. Feeding forward-looking financials or deal discussions into third-party AI systems creates legal exposure, including insider-trading and confidentiality risk. In one strategy workshop, I watched a founder paste a full investor update into a chatbot for “tone polishing,” weeks before the round was publicly announced.

7. Raw CRM Exports and Customer Lists

Sales and marketing teams frequently paste CRM rows into chatbots to “clean up” messaging.

Those rows typically include emails, phone numbers, purchase histories, and internal notes that combine into highly sensitive profile datasets stored outside your controlled environment. Once a customer dataset leaves your infrastructure via a chat prompt, you lose fine-grained control over who may access it and for how long. We usually find at least one active shadow workflow like this in every commercial team we audit.

8. HR Files, Complaints, and Performance Reviews

HR data is both emotionally and legally sensitive. It requires strict internal controls.

Performance reviews, salary bands, complaints, and disciplinary letters can trigger legal obligations around confidentiality and fairness. If those details are captured in an AI provider’s logs, more people than intended may gain access during safety reviews. I have watched HR leaders paste entire disciplinary letters into chatbots, asking for the tone to be softened — transferring a delicate internal situation into a third-party log with zero additional protection.

9. Classified or Controlled Government Data

If your organization handles classified information or Controlled Unclassified Information (CUI), treat all public AI as strictly off-limits.

NIST’s AI RMF documentation explicitly advises that sensitive government-related data be handled only within properly authorized and controlled environments. Public LLM services are neither designed nor authorized to handle CUI. I worked with a defense contractor where a junior analyst nearly pasted a sensitive incident report into a chatbot. We caught it during training; without that training, it would have been an unreported violation.

10. Legal Documents and Privileged Communications

Legal teams are tempted to use AI to summarize discovery, rewrite clauses, or brainstorm arguments.

Voluntarily sharing attorney–client communications with a third-party AI provider can jeopardize privilege protections, as the communication is no longer confidential solely between attorney and client. We anticipate that courts will sharpen their rulings on AI-involved privilege waiver over the next few years. Relying on favorable future interpretations is a fragile legal strategy.

The Local Sanitization Workaround

You do not need to ban AI entirely. You need a controlled buffer between staff and the public Internet.

Microsoft’s open-source Presidio SDK detects and anonymizes PII in text, images, and structured data using named-entity recognition, regex engines, and rule-based logic. It provides analyzer and anonymizer modules you can integrate directly into internal scripts to auto-redact names, card numbers, SSNs, and custom-defined patterns before any data leaves your network.
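
A minimal sketch of those two modules in action, assuming Presidio is installed (`pip install presidio-analyzer presidio-anonymizer`); the sample sentence is invented:

```python
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()      # NER + regex + rule-based recognizers
anonymizer = AnonymizerEngine()  # applies redaction operators to findings

text = "Call Maria Keller at 212-555-0198 about invoice 4417."

# Detect PII spans, then replace each with its entity-type placeholder.
findings = analyzer.analyze(text=text, language="en")
redacted = anonymizer.anonymize(text=text, analyzer_results=findings)

print(redacted.text)  # e.g. "Call <PERSON> at <PHONE_NUMBER> about invoice 4417."
```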

We built a “prompt firewall” for one client: a lightweight internal web form that passes all text through Presidio plus custom regex rules before forwarding to any external LLM. Staff never talk to ChatGPT directly — they talk to the safe wrapper, which enforces redaction and maintains an internal audit log.
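
A stripped-down sketch of that wrapper’s core, under the same assumptions as above; the function and rule names are ours, not part of any client system:

```python
import re
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

# House rules layered on top of Presidio; this pattern list is illustrative.
CUSTOM_RULES = [re.compile(r"\bsk-[A-Za-z0-9]{20,}\b")]

_analyzer = AnalyzerEngine()
_anonymizer = AnonymizerEngine()

def prompt_firewall(prompt: str) -> str:
    """Block credential-shaped text, then redact PII before forwarding."""
    for rule in CUSTOM_RULES:
        if rule.search(prompt):
            raise ValueError("Blocked: credential-like pattern in prompt")
    findings = _analyzer.analyze(text=prompt, language="en")
    clean = _anonymizer.anonymize(text=prompt, analyzer_results=findings).text
    # In the real deployment, `clean` is also written to an internal audit log
    # before being forwarded to the external LLM.
    return clean
```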

Transparent Limitations You Must Know

Presidio’s own maintainers explicitly warn in their official FAQ that automated detection cannot guarantee all sensitive information is identified, and recommend layering additional human review and controls on top. If a user rewrites sensitive information descriptively enough to bypass pattern recognizers — for example, identifying a person by role and department instead of name — that meaning can still leak.

Similarly, turning off ChatGPT’s training toggle changes how data feeds model improvement pipelines but does not instantly purge prompts from operational or safety-review storage. Court orders can also compel providers to preserve logs that would otherwise be deleted. Relying purely on settings toggles and employee goodwill is not an acceptable control for high-risk environments.

Expert Pro-Tip: Build a “Safe Prompt Library”

Instead of letting every employee improvise their own prompts, curate a pre-approved “Safe Prompt Library” of role-specific templates that never require pasting raw sensitive data.

From our audits, this single change dramatically reduces risky behavior, because people prefer copy-paste convenience over writing new prompts from scratch. Hand them safe, useful templates tailored to their role, make clear exactly what not to share with your AI chatbot, and they are far less likely to invent dangerous prompts on the fly. Anchor this initiative inside a broader AI risk-governance program aligned to the NIST AI RMF.
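
A minimal sketch of such a library; the template IDs, wording, and placeholders are illustrative assumptions, not a shipped product:

```python
# Role-keyed, pre-approved templates; all IDs and wording are illustrative.
SAFE_PROMPTS = {
    "sales.follow_up": (
        "Write a friendly follow-up email for a customer interested in "
        "{product_category}. Do not include any names or contact details."
    ),
    "dev.refactor": (
        "Refactor this function for readability. Internal identifiers and "
        "endpoints have already been stripped:\n{sanitized_code}"
    ),
}

def render(template_id: str, **fields: str) -> str:
    """Fill a pre-approved template; unknown IDs raise a KeyError."""
    return SAFE_PROMPTS[template_id].format(**fields)

print(render("sales.follow_up", product_category="cold-chain shipping"))
```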
