The single most predictable AI security failure is also the most preventable. Engineers paste configuration files, database dumps, customer emails, and API responses into prompts because that is the fastest way to ask a question. Training people not to is a losing strategy.
What runs in the harness
A DLP interceptor sits between the user and the model. On every prompt:
1. Pattern detection: API keys (Stripe, AWS, OpenAI, GitHub), PII (SSN, PAN, email, phone), and any custom classifiers your security team has loaded.
2. Inline redaction: matched tokens are replaced with redaction markers in the outbound prompt.
3. Audit capture: the original prompt, the redacted prompt, and the diff are stored in the audit log under the user's identity.
4. Reversibility (optional): for workflows that need it, the harness can keep a tokenized mapping so model output can be de-redacted on the way back, scoped to the original user.
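The four steps above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the harness's actual implementation: the names `PATTERNS`, `RedactionResult`, `redact`, `audit_record`, and `restore`, and the `[REDACTED:LABEL:N]` marker format, are all hypothetical. The regular expressions cover only a few well-known formats and would need tuning before real use.

```python
import difflib
import re
from dataclasses import dataclass, field

# Illustrative detectors only (assumed, not the product's rule set).
PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GITHUB_TOKEN": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}


@dataclass
class RedactionResult:
    user: str        # identity the prompt was submitted under
    original: str    # prompt as typed
    redacted: str    # prompt that would be sent to the model
    mapping: dict = field(default_factory=dict)  # marker -> original value


def redact(user: str, prompt: str) -> RedactionResult:
    """Steps 1 and 2: detect matches and replace each with a marker."""
    mapping = {}
    redacted = prompt
    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            marker = f"[REDACTED:{label}:{len(mapping) + 1}]"
            mapping[marker] = match.group(0)
            return marker
        redacted = pattern.sub(replace, redacted)
    return RedactionResult(user, prompt, redacted, mapping)


def audit_record(result: RedactionResult) -> dict:
    """Step 3: original, redacted, and diff, keyed to the user."""
    diff = list(difflib.unified_diff(
        result.original.splitlines(),
        result.redacted.splitlines(),
        lineterm="",
    ))
    return {
        "user": result.user,
        "original": result.original,
        "redacted": result.redacted,
        "diff": diff,
    }


def restore(result: RedactionResult, model_output: str, user: str) -> str:
    """Step 4: de-redact model output, only for the original user."""
    if user != result.user:
        raise PermissionError("mapping is scoped to the original user")
    for marker, value in result.mapping.items():
        model_output = model_output.replace(marker, value)
    return model_output
```

Calling `redact("alice", prompt)` returns the outbound text plus the marker mapping; `restore` refuses to de-redact for any other identity. Because the audit record in this sketch holds the original prompt, the audit store itself would need to be protected at least as strictly as the secrets it contains.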
Why edge redaction beats post-hoc detection
Most data-loss tools inspect logs after the call has completed. By then the secret has already left your tenant: it may sit in the model provider's logs and possibly in their training pipeline. Edge redaction replaces a detected secret before the prompt leaves your environment, so the provider only ever sees the redaction marker.
Custom classifiers
Your security team can ship YARA-style rules into the policy bundle — anything from "internal codename" patterns to fingerprinted PHI formats. The rules are signed with the rest of the policy; the harness will not start if they have been tampered with.
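As a rough sketch of the fail-closed behavior described above, the following Python verifies a rule bundle before compiling it and refuses to proceed on a mismatch. Everything here is an assumption made for illustration: the bundle is modeled as a name-to-regex dictionary rather than real YARA-style rules, the signature is an HMAC-SHA256 over canonical JSON (a production system would more likely use an asymmetric signature), and `sign_bundle` and `load_bundle` are hypothetical names.

```python
import hashlib
import hmac
import json
import re


def sign_bundle(rules: dict, key: bytes) -> str:
    """Sign the rules (hypothetical helper; HMAC is a stand-in scheme)."""
    # sort_keys gives a canonical serialization so signing is stable.
    payload = json.dumps(rules, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def load_bundle(rules: dict, signature: str, key: bytes) -> dict:
    """Verify the signature, then compile the rules; fail closed."""
    expected = sign_bundle(rules, key)
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected, signature):
        raise RuntimeError("policy bundle signature mismatch; refusing to start")
    return {name: re.compile(pattern) for name, pattern in rules.items()}
```

A security team would sign `{"INTERNAL_CODENAME": r"\bPROJECT-[A-Z]{4,}\b"}` at publish time; at startup, `load_bundle` raises if a rule has been added, removed, or altered since signing.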