We scanned 20 AI repos for leaked keys. Every scanner alert was a false positive.

getdebug ships a secret scanner as part of its free tier — committed credentials are the one finding category we surface without an account, because the cost of a leaked key is high enough that even a 30-second check is worth running. So we did the obvious thing: we ran our scanner against 20 public AI-starter repos on GitHub, expecting to find some real leaks. The premise was that someone in a corpus of mid-popularity AI scaffolds must have committed a real OpenAI key.

Every single scanner alert was a false positive.

The numbers

Across the 20-repo sweep, our scanner produced 12 alerts at critical severity. Zero of them were real credentials. Two repos accounted for all 12:

stackitcloud/rag-template: 7 scanner alerts, all false positives. Every hit was either a placeholder value in a .env.template file (e.g. STACKIT_VLLM_API_KEY=your-stackit-vllm-api-key) or a read of an environment variable by name (import.meta.env.X), which exposes the variable's name but not its value.
A popular Claude Code starter template: 5 scanner alerts, all false positives. Three were "Private key block" matches inside CHANGELOG.md and SNAPSHOT.md, which show PEM-formatted example output. The other two were the funniest: PEM markers appearing in comments next to grep patterns and redaction regexes that exist to strip the same shape. Secret detectors tripping on secret-detector code.
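To make that last failure mode concrete, here is a minimal Python sketch. It is not getdebug's detector: the PEM_HEADER pattern, the naive_pem_hits helper, and the sample script are all assumptions invented for illustration. A rule that only looks for the header text cannot tell a key block from a comment that describes one.

```python
import re

# Hypothetical naive rule: flag any line containing a PEM private-key header.
# Illustrative pattern only, not getdebug's real detector.
PEM_HEADER = re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----")

def naive_pem_hits(text: str) -> list[int]:
    """Return 1-based line numbers whose content matches the PEM header."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if PEM_HEADER.search(line)
    ]

# Invented sample: a redaction helper whose comment names the shape it strips.
redaction_script = "\n".join([
    "# Strip blocks that start with -----BEGIN RSA PRIVATE KEY-----",
    "PATTERN = 'BEGIN .* PRIVATE KEY'",
    "def redact(s): ...",
])

# The comment on line 1 is flagged even though no key material exists.
hits = naive_pem_hits(redaction_script)
```

The match lands on the comment line because the header string is the only signal the rule has; telling the two cases apart needs to know whether the match sits inside a comment.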

What we shipped because of it

A false-positive rate that high on a corpus this small is a real problem. So we read every hit, classified the failure modes, and shipped three detector rules into both the CLI (@getdebug/cli) and the hosted analyze worker:

Broader env-template matching. Any file whose path ends in .env.template, .env.example, or .env.sample, or that lives under a directory named examples/, is treated as a template by default. Findings inside still surface, but at info severity rather than critical.
Doc-context suppression. Hits inside fenced code blocks in markdown, or under headings like "Example output" / "Sample response", no longer trip critical severity. The detector still records them — they just don't page anyone.
Env-var-read skip in entropy. The entropy-based detector now recognizes process.env.X, import.meta.env.X, and os.environ["X"] as identifier reads, not opaque high-entropy strings. The variable name being long and random-looking doesn't make the access a leak.
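The three rules can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the shipped implementation: every name here (TEMPLATE_SUFFIXES, ENV_READ, is_template_path, is_env_read, in_fenced_block, severity) is hypothetical, the heading-based check is omitted, and mapping doc-context hits to "info" is our guess at a reasonable non-critical level.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# All names and patterns below are assumptions for illustration.
TEMPLATE_SUFFIXES = (".env.template", ".env.example", ".env.sample")
FENCE = "`" * 3  # markdown code-fence marker

# Matches process.env.X, import.meta.env.X, os.environ["X"] or os.environ['X'].
ENV_READ = re.compile(
    r"""^(?:process\.env\.\w+|import\.meta\.env\.\w+|os\.environ\[(["'])\w+\1\])$"""
)

def is_template_path(path: str) -> bool:
    """True for env-template file names or files under an examples/ directory."""
    p = PurePosixPath(path)
    return p.name.endswith(TEMPLATE_SUFFIXES) or "examples" in p.parts[:-1]

def is_env_read(token: str) -> bool:
    """True when the token reads an environment variable by name."""
    return ENV_READ.match(token.strip()) is not None

def in_fenced_block(markdown_lines: list[str], index: int) -> bool:
    """True when the 0-based line `index` sits inside a fenced code block."""
    inside = False
    for line in markdown_lines[:index]:
        if line.lstrip().startswith(FENCE):
            inside = not inside
    return inside

def severity(path: str, token: str, in_doc_example: bool = False) -> Optional[str]:
    """None means skip; 'info' means record without paging; else 'critical'."""
    if is_env_read(token):
        return None  # an identifier read, not an opaque high-entropy string
    if is_template_path(path) or in_doc_example:
        return "info"  # assumed level for doc-context hits
    return "critical"
```

For example, severity(".env.example", "your-api-key") yields "info", while severity("src/app.py", "process.env.SECRET") yields None because the token is a variable read rather than a value.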

Re-running the same 20-repo sweep after these three rules landed cut critical false positives from 12 to 2, an 83% reduction. Both remaining FPs are in the same Claude Code starter template, and they need a fourth rule we haven't shipped yet: PEM-in-comment suppression, which requires cross-language comment parsing. We want to be upfront about that: it's an open detector gap, not a closed one.

Why publish this

Every security vendor's landing page claims a low false positive rate. Almost nobody shows the work. We'd rather be the team that publishes its scanner being wrong, ships fixes in public, and re-runs the numbers — because that's what we'd want from a tool we were thinking of buying.

The corpus is reproducible. The methodology, the per-tool numbers, and the JSON output schema live under /bench. If you find a case the scanner gets wrong on your own code, the issue tracker on github.com/getdebug-ai/cli is the right place to put it. Detector tuning is still early, and the easiest way to improve it is to point us at the noise.

Try it

The secret-scanning detectors that came out of this sweep are in the free tier of the CLI. No account needed; nothing leaves your laptop.

# macOS / Linux
brew install getdebug-ai/tap/getdebug
getdebug analyze .

Full install instructions and the rest of the commands are in the docs.