
AI risks: security and exposure of sensitive data

Sending code to a cloud LLM can mean sending your secrets, business logic, and customer data to a third party. The risk is real and widely underestimated.

Using cloud AI tools in development creates a new data leak vector. When a developer pastes code into ChatGPT to ask for help, or when Copilot sends context to its servers to generate a completion, potentially sensitive information leaves your security perimeter. API tokens, private keys, environment variables, proprietary business logic, test data containing PII — all of this can end up in prompts sent to a cloud provider.
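One mitigation is to scan prompts for obvious credential patterns before they leave your machine. Below is a minimal sketch; the patterns are illustrative only, and a real scanner (tools like gitleaks or truffleHog) covers far more credential formats.

```python
import re

# Illustrative patterns only -- not an exhaustive credential ruleset.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "env_assignment": re.compile(r"(?im)^(?:API_KEY|SECRET|TOKEN|PASSWORD)\s*=\s*\S+"),
}

def find_secrets(prompt: str) -> list[str]:
    """Return the names of the secret patterns detected in a prompt."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(prompt)]

snippet = 'client = boto3.client("s3", aws_access_key_id="AKIAIOSFODNN7EXAMPLE")'
print(find_secrets(snippet))  # → ['aws_access_key']
```

A check like this can run as a pre-commit hook or a wrapper around whatever tool sends the prompt, blocking the request when the list is non-empty.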

Real incidents have already happened. In 2023, Samsung restricted internal ChatGPT use after engineers pasted proprietary source code into it. Developers have accidentally included AWS keys in Copilot prompts. Beyond accidents, there is a question of principle: do the tools' terms of service allow the provider to use your data to fine-tune its models? The answer varies by provider, by plan, and by the third parties involved.

Practical protection measures: use enterprise versions of the tools (GitHub Copilot Enterprise, Claude for Work), which offer stronger confidentiality guarantees. Run models locally (for example Code Llama served through Ollama) for sensitive projects. Train teams never to include sensitive data in prompts. Establish a clear usage policy: which tools, for which types of code, with what precautions. The best-kept secret is the one that never leaves your infrastructure.

  • Prefer Enterprise offerings with confidentiality guarantees
  • Explore local models for critical projects
  • Train teams on prompt best practices
  • Never include credentials or PII in a cloud prompt
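A usage policy like the one above can even be made executable. The sketch below is a hypothetical routing rule, not a real product's API: paths, markers, and backend names are all illustrative assumptions for how a team might decide which prompts may go to an approved cloud tool and which must stay on a local model.

```python
from dataclasses import dataclass

# Illustrative policy: anything touching sensitive paths or PII is
# routed to a local backend (e.g. a model served by Ollama); the rest
# may use the approved enterprise cloud tool.
SENSITIVE_MARKERS = (".env", "secrets/", "customers/", "billing/")

@dataclass
class Route:
    backend: str  # "local" or "cloud"
    reason: str

def route_prompt(file_path: str, contains_pii: bool) -> Route:
    """Decide where a prompt about a given file is allowed to go."""
    if contains_pii or any(m in file_path for m in SENSITIVE_MARKERS):
        return Route("local", "sensitive data must stay in-house")
    return Route("cloud", "no sensitive markers detected")

print(route_prompt("src/utils/format.py", contains_pii=False).backend)  # → cloud
print(route_prompt("config/.env", contains_pii=False).backend)          # → local
```

The point is less the specific rules than having the decision encoded somewhere auditable, rather than left to each developer in the moment.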
