What is gitleaks and how does it differ from TruffleHog?

Gitleaks is an open-source secrets scanner for git repositories, detecting hardcoded credentials using regex rules and entropy analysis. The key difference from TruffleHog (Demo D1): gitleaks is dramatically faster (13ms vs 2s for the same repo) and uses lightweight rule matching, while TruffleHog has 700+ specialized detectors and can verify secrets via live API calls. Gitleaks excels at CI/CD pre-commit scanning where speed matters. TruffleHog excels at deep retrospective audits where coverage matters. SecurityClaw runs both — they catch different things.

If I deleted a file with git rm, is the secret gone?

No. git rm removes the file from the working tree and staging area, but the content is permanently embedded in the commit where it was first added. Every clone of the repository has a full copy of git history, including that commit. The secret is accessible to anyone who has ever cloned the repo, anyone who clones it in the future, and any scanner that reads git history rather than just the current file tree. The only remediation is to rotate the secret immediately AND rewrite git history using git filter-branch or BFG Repo-Cleaner. Both operations require force-pushing and coordinating with anyone who has a clone.

What is entropy analysis in secret detection?

Entropy is a measure of randomness in a string, expressed in bits per character. Normal English text has entropy around 3.5-4.5 bits/char. Secrets (API keys, tokens, passwords) have higher entropy because they're designed to be random and unpredictable. Base64-encoded data has characteristic entropy around 5.7 bits/char, against a theoretical ceiling of 6 for its 64-symbol alphabet; hex-encoded secrets top out at 4.0 bits/char because hex has only 16 symbols. Short strings measure lower than these figures because they cannot use every symbol evenly. Gitleaks combines entropy analysis with variable name context: a high-entropy string assigned to a variable named API_KEY or SECRET is flagged. This is how gitleaks caught our base64-encoded key even though it had no recognizable pattern: the combination of high entropy and context triggered the rule.
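To make the definition concrete, here is a minimal Python sketch of per-character Shannon entropy. It illustrates the measure only and is not gitleaks's own implementation; the function name and the sample strings are made up for this example.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # Sum p * log2(1/p) over every distinct character.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical sample values, chosen only to contrast repetitive and varied strings.
samples = {
    "repeated": "aaaaaaaaaaaaaaaa",
    "placeholder": "your_password_here",
    "hex-like": "9f86d081884c7d659a2feaa0c55ad015",
    "base64-like": "q7Xz2LmP0vRt8KsYc4WdJh1NbGfUe6Ao",
}

for label, value in samples.items():
    print(f"{label:12s} {shannon_entropy(value):.2f} bits/char")
```

A string drawn from a 16-symbol alphabet can never exceed 4 bits/char, whatever its length, which is why hex-encoded values sit below base64-style values on this scale.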

Should I use gitleaks or TruffleHog?

Use both, for different purposes. Gitleaks in your pre-commit hook or CI/CD pipeline — it's fast enough to not slow down your development workflow (13ms for this demo repo). TruffleHog for your initial retrospective audit of a codebase, and for any finding that needs live verification (TruffleHog can confirm whether a found AWS key is still active). SecurityClaw's dual-scanner workflow runs gitleaks first for speed, then TruffleHog for depth. Together they close most gaps. For YAML config file passwords not caught by either tool's default rules, add a custom .gitleaks.toml rule to your repo.

How do I add gitleaks to my CI/CD pipeline?

The fastest integration is a pre-commit hook: gitleaks protect --staged runs in milliseconds and blocks commits containing secrets before they ever reach the remote. For GitHub Actions, gitleaks provides a first-party action (gitleaks/gitleaks-action) that scans on every push and pull request. For GitLab CI, add gitleaks detect --source . --exit-code 1 as a security stage job. A non-zero exit code fails the pipeline on any finding; 1 is already the default, so the flag simply makes the behaviour explicit. For repos with existing history, run gitleaks detect --source . first to audit the full commit log before setting up the pre-commit hook.
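As a rough illustration of the pre-commit approach, the sketch below wraps the gitleaks protect --staged command from the answer above in a small Python hook. The file layout and function names are hypothetical, and the choice to block the commit when gitleaks is not installed is an assumption you may want to reverse.

```python
#!/usr/bin/env python3
"""Hypothetical .git/hooks/pre-commit script that delegates to gitleaks."""
import shutil
import subprocess
import sys


def build_command() -> list[str]:
    # `protect --staged` scans only staged changes (gitleaks v8.18 syntax,
    # the version used in this article).
    return ["gitleaks", "protect", "--staged"]


def run_hook() -> int:
    """Return 0 to allow the commit, non-zero to block it."""
    if shutil.which("gitleaks") is None:
        # Assumption: fail closed when the scanner is missing.
        print("gitleaks not found on PATH; blocking commit", file=sys.stderr)
        return 1
    # gitleaks exits non-zero when it finds a leak, which git treats as a veto.
    return subprocess.run(build_command()).returncode
```

To use it as a real hook, save it as .git/hooks/pre-commit, make it executable, and end the file with sys.exit(run_hook()).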

What does gitleaks miss, and how do you close the gap?

Gitleaks's default ruleset (approximately 150 rules) misses secrets that don't match a known pattern and lack entropy signals. In our demo, the AWS Secret Access Key (a random 40-character string with no prefix like AKIA) was missed — there's no reliable pattern to detect it without context. PostgreSQL passwords in YAML config files were also missed — gitleaks doesn't have a default rule for password: value in YAML. Fixes: for AWS, use TruffleHog's verified AWS detector which checks both key and secret together. For YAML passwords, write a custom .gitleaks.toml rule targeting common config file patterns. Both are documented workarounds, not hard limitations.

We Deleted the Key. Gitleaks Found It Anyway. Here's Why.

March 12, 2026 · SecurityClaw Demo D13 · Supply Chain Security

A developer panics. They committed a private key by mistake. They delete the file, push the commit, breathe a sigh of relief. They were never safe. We built exactly this scenario — 10 commits, 8 planted secrets, one RSA key deleted in the next commit after it was added — and ran SecurityClaw + Gitleaks against it. 13.2 milliseconds. 9 findings. 6 secrets found — including the one that "didn't exist anymore."

Date: March 12, 2026

Tool: gitleaks v8.18.0

Category: Supply Chain Security

Result: ✅ Pass (6/8 unique secrets)

Scan time: 13.2 milliseconds

Methodology note: We created a controlled git repo simulating a real developer workflow — 10 commits, 8 secrets planted across different files and scenarios. The repo includes realistic commit messages and file structures. All secrets were planted by us; all scan results are real. This is Demo D13 in the SecurityClaw campaign series. See also Demo D1 (TruffleHog) — gitleaks and TruffleHog are complementary tools, not alternatives.

The Setup

The demo repo had 10 commits simulating a real small-team development workflow: initial setup, adding configuration files, committing credentials "temporarily", trying to clean up, and pushing code that still carried secrets in its history.

We planted 8 secrets covering the most common real-world exposure categories:

Secret Type	File	Special Factor
AWS Access Key ID	`config/aws.yml`	Classic credential exposure
AWS Secret Access Key	`config/aws.yml`	Paired with Access Key ID above
PostgreSQL password	`config/database.yml`	YAML config pattern
GitHub PAT / JWT secret	`src/auth/config.py`	Dual-use credential
Slack Webhook URL	`config/notifications.json`	Dedicated detection rule
RSA Private Key	`keys/deploy_key.pem`	Deleted in the next commit
Stripe Secret Key	`src/payments.py`	Multi-rule match
Base64-encoded API key	`.env`	Obfuscated — entropy catch

The scan command:

gitleaks detect --source . --report-format json --report-path results.json --verbose

Runtime: 13.2 milliseconds. 9 raw findings. 6 unique secrets identified.

What Gitleaks Found

#	Secret Type	Found?	Rule	Why Notable
1	AWS Access Key ID	✅ YES	`aws-access-token`	`AKIA...` prefix triggers dedicated rule
2	AWS Secret Access Key	❌ MISSED	—	No rule for random 40-char strings
3	PostgreSQL password (YAML)	❌ MISSED	—	`password:` in YAML not in default rules
4	GitHub PAT / JWT secret	✅ YES	`generic-api-key`	Entropy + variable name context
5	Slack Webhook URL	✅ YES	`slack-webhook-url`	Dedicated Slack rule fires immediately
6	RSA Private Key (DELETED)	✅ YES	`private-key`	Found in git history — file gone from working tree
7	Stripe Secret Key	✅ YES	`stripe-access-token`	3 raw findings = 1 secret (multi-rule match)
8	Base64-encoded API key	✅ YES	`generic-api-key`	Entropy analysis saw through the obfuscation

Score: 6/8 unique secrets. 0 false positives (1 low-risk OAuth client ID flagged for triage).
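The gap between raw findings and unique secrets comes from several rules firing on the same value. A minimal sketch of that deduplication is shown here, assuming the JSON report is a list of objects with RuleID, Secret and File fields; the sample entries and placeholder values are invented for illustration and are not taken from the demo's results.json.

```python
import json
from collections import defaultdict


def unique_secrets(findings: list[dict]) -> dict[str, set[str]]:
    """Map each distinct secret value to the set of rule IDs that flagged it."""
    rules_by_secret: dict[str, set[str]] = defaultdict(set)
    for finding in findings:
        rules_by_secret[finding["Secret"]].add(finding["RuleID"])
    return dict(rules_by_secret)


# Hypothetical, heavily trimmed report entries (real entries carry more fields).
sample_report = json.loads("""
[
  {"RuleID": "stripe-access-token", "Secret": "PLACEHOLDER_STRIPE_KEY", "File": "src/payments.py"},
  {"RuleID": "generic-api-key",     "Secret": "PLACEHOLDER_STRIPE_KEY", "File": "src/payments.py"},
  {"RuleID": "slack-webhook-url",   "Secret": "PLACEHOLDER_WEBHOOK_URL", "File": "config/notifications.json"}
]
""")

summary = unique_secrets(sample_report)
print(f"{len(sample_report)} raw findings, {len(summary)} unique secrets")
for secret, rules in summary.items():
    print(f"  {secret[:12]}... flagged by {sorted(rules)}")
```

To run it against a real scan, replace sample_report with the parsed contents of the file written by --report-path.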

The Two Moments That Matter

Moment 1: The Deleted Key Was Not Gone

The RSA private key was committed in commit 8172da4 — the developer added it to the deploy configuration. In the very next commit, a182f88, they deleted it with the message: "Remove deploy key from repo (oops, was committed by mistake)."

The file does not exist in the current working tree. ls keys/ returns nothing. git status shows a clean repository.

Gitleaks found the key in commit 8172da4's diff. It's in the history. It will always be in the history. Every past and future clone of this repo has it.

The lesson is not subtle: git rm does not remove secrets. git push already made them permanent.
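The point can be checked without any scanner, since plain git will list deletions and hand back the deleted content. The sketch below is one way to do that in Python; the helper names and the commit: prefix in the log format are choices made for this example, and the sample input reuses the abbreviated commit hash from the demo (a real run with %H prints full hashes).

```python
import subprocess


def parse_deletions(log_output: str) -> list[tuple[str, str]]:
    """Parse `git log --diff-filter=D --name-only --format=commit:%H` output.

    Returns (deleting_commit, path) pairs.
    """
    pairs = []
    commit = None
    for line in log_output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("commit:"):
            commit = line.removeprefix("commit:")
        elif commit is not None:
            pairs.append((commit, line))
    return pairs


def deleted_files(repo: str = ".") -> list[tuple[str, str]]:
    """List every file deletion recorded anywhere in the repo's history."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--all", "--diff-filter=D",
         "--name-only", "--format=commit:%H"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_deletions(out)


def recover_command(deleting_commit: str, path: str) -> list[str]:
    # The parent of the deleting commit (commit^) still contains the file.
    return ["git", "show", f"{deleting_commit}^:{path}"]


sample = "commit:a182f88\n\nkeys/deploy_key.pem\n"
for commit, path in parse_deletions(sample):
    print(" ".join(recover_command(commit, path)))
```

Calling deleted_files() inside a repository gives a quick inventory of everything that was "removed" but is still one git show away.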

The only remediation is to do both of the following:

1. Rotate the key immediately and assume it's compromised. It has been accessible to anyone with repo access since the original push.
2. Rewrite git history, using git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch keys/deploy_key.pem' -- --all (the trailing -- --all covers every branch, not just the current one) or the faster BFG Repo-Cleaner. This requires a force-push and coordination with every team member who has a clone.

Most teams only do step 1. Step 2 is operationally painful. Which means the commit history stays poisoned, and any future clone pulls the deleted-but-present secret along with it.

Moment 2: Base64 Is Not Encryption

One .env entry was encoded:

INTERNAL_API_KEY=aW50ZXJuYWwtc2VjcmV0LWFwaS1rZXktMTIzNDU2Nzg5MA==

The developer had added a comment: "This is base64-encoded but still a secret (obfuscation != encryption)", acknowledging that the encoding was cosmetic. Gitleaks caught it anyway via entropy analysis. Base64 has characteristic entropy around 5.7 bits per character (the ceiling is 6), clearly distinct from natural language (3.5–4.5 bits/char) and from hex strings (at most 4.0 bits/char). Combined with the variable name INTERNAL_API_KEY, the rule fired.

No pattern match needed. The obfuscation was transparent to entropy analysis. This matters because developers sometimes think encoding a secret reduces detection risk. It doesn't. It often increases it, because encoded secrets have higher entropy than plaintext ones.
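How thin the obfuscation is becomes obvious once the value is decoded. This short Python sketch uses only the standard library and the .env value shown above; no key or password is involved at any step.

```python
import base64

# Value copied from the INTERNAL_API_KEY entry in the demo .env file.
encoded = "aW50ZXJuYWwtc2VjcmV0LWFwaS1rZXktMTIzNDU2Nzg5MA=="

decoded = base64.b64decode(encoded).decode("ascii")
print(decoded)  # internal-secret-api-key-1234567890

# Encoding is fully reversible in both directions by anyone.
assert base64.b64encode(decoded.encode("ascii")).decode("ascii") == encoded
```

Anyone who can read the file can recover the plaintext in one call, so the encoded form deserves the same handling as the secret itself.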

The Honest Gaps

Two secrets were missed. Both are genuine limitations of gitleaks's default ruleset:

AWS Secret Access Key — No Pattern Rule

The AWS Key ID (AKIA... prefix) was caught by a dedicated rule. The AWS Secret Access Key — a random 40-character alphanumeric string — was not. There's no distinctive pattern in the secret itself. Gitleaks has no default rule for it because random 40-character strings appear throughout codebases as hashes, tokens, and other non-secret identifiers — a rule would generate enormous false positives.

TruffleHog's verified AWS detector handles this by checking both the key ID and secret together and verifying them against AWS's API. It's slower but catches what gitleaks misses here.
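The false-positive problem is easy to reproduce. The sketch below compares a simplified prefix rule with a naive 40-character rule; both regexes are illustrative stand-ins written for this example, not the patterns gitleaks ships, and the credentials are the placeholder values from AWS's public documentation.

```python
import re

# Simplified stand-in for a key-ID rule: fixed "AKIA" prefix plus 16 characters.
KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Naive rule for the secret: any run of exactly 40 base64-style characters.
NAIVE_SECRET = re.compile(
    r"(?<![A-Za-z0-9/+=])[A-Za-z0-9/+=]{40}(?![A-Za-z0-9/+=])"
)

lines = [
    # AWS documentation example key ID
    "aws_access_key_id: AKIAIOSFODNN7EXAMPLE",
    # AWS documentation example secret key
    "aws_secret_access_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    # Not a secret at all: a 40-character SHA-1 hash
    "last_deploy_commit: da39a3ee5e6b4b0d3255bfef95601890afd80709",
]

for line in lines:
    print(f"key_id={bool(KEY_ID.search(line))!s:5} "
          f"naive_secret={bool(NAIVE_SECRET.search(line))!s:5} {line}")
```

The prefix rule fires only on the key ID, while the naive rule fires on the real secret and on the commit hash alike, which is the trade-off described above.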

PostgreSQL Password in YAML — Not in Default Rules

A plaintext password in a YAML config file (password: Sup3rS3cr3t123) was not flagged. Gitleaks's default rules don't cover the password: pattern in YAML because it generates false positives on placeholder values and documentation examples.

The fix: add a custom rule to a .gitleaks.toml committed to your repo:

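# Keep the built-in rules active; without this, a custom config replaces them.
[extend]
useDefault = true

[[rules]]
id = "yaml-password"
description = "Password in YAML config file"
regex = '''(?i)password\s*:\s*.{8,}'''
path = '''.*\.(yml|yaml)$'''
entropy = 3.5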

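This rule combines pattern matching with a minimum entropy threshold to reduce false positives on placeholder text like "changeme" or "your_password_here". Tune the threshold against your own config files before relying on it, because short human-chosen passwords and long placeholders can score close to each other.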
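One way to sanity-check a threshold before committing the rule is to approximate it in Python and feed it sample lines. This is a rough sketch, not a reimplementation of gitleaks: the regex gains a capture group so the value can be isolated, and measuring entropy on that captured value alone is an assumption made for this example.

```python
import math
import re
from collections import Counter

# Python approximations of the rule's fields; gitleaks itself uses Go's regexp engine.
RULE_REGEX = re.compile(r"(?i)password\s*:\s*(.{8,})")
RULE_PATH = re.compile(r".*\.(yml|yaml)$")
MIN_ENTROPY = 3.5


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character."""
    counts = Counter(text)
    total = len(text)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def evaluate(path: str, line: str) -> str:
    """Return a short verdict for one line of one file."""
    if not RULE_PATH.search(path):
        return "skipped (path does not match)"
    match = RULE_REGEX.search(line)
    if match is None:
        return "no regex match"
    # Assumption: entropy is measured on the captured value only.
    value = match.group(1).strip()
    score = shannon_entropy(value)
    verdict = "FLAG" if score > MIN_ENTROPY else "below threshold"
    return f"{verdict} (entropy {score:.2f})"


candidates = [
    ("config/database.yml", "password: Sup3rS3cr3t123"),
    ("config/database.yml", "password: changeme"),
    ("config/database.yml", "password: your_password_here"),
    ("README.md", "password: Sup3rS3cr3t123"),
]
for path, line in candidates:
    print(f"{path:22s} {line:32s} -> {evaluate(path, line)}")
```

Under this value-only assumption the demo password itself scores just under 3 bits/char, below the 3.5 threshold, so it is worth running your own values through a check like this and adjusting the number before trusting the rule.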

Gitleaks vs. TruffleHog: When to Use Each

D1 was TruffleHog. D13 is gitleaks. They're not competing — they're complementary. SecurityClaw runs both because they catch different secrets in different ways.

Feature	TruffleHog (D1)	Gitleaks (D13)
Speed (same repo)	~2 seconds	13.2ms
Detectors	700+ specialized	~150 rules
Live secret verification	Yes — checks if key still active	No
Custom rules	Config-based	.toml file
Git history scan	Yes	Yes
Entropy detection	Limited	Strong
Best fit	Retrospective audits	Pre-commit hooks and CI gates
D1/D13 result	4/5 planted secrets	6/8 planted secrets

SecurityClaw's dual-scanner workflow: run gitleaks first (13ms, catches high-entropy and pattern-based secrets, perfect for CI gates). Run TruffleHog for deeper retrospective audits where live verification matters. Add custom rules to your .gitleaks.toml for organisation-specific credential patterns.

For teams building security into their development workflow, the concepts underlying these tools — entropy analysis, regex-based detection, git internals — are covered well in Black Hat Python. And The Web Application Hacker's Handbook covers credential exposure in the broader context of web application assessment — where git history secrets often lead directly to application-layer compromise.

SecurityClaw Scorecard: D13

D13 adds a second campaign to SecurityClaw's supply-chain-security category. Together with D1 (TruffleHog) and D9 (supply-chain-scanner), the supply chain coverage now spans three complementary approaches: package integrity, malicious behaviour detection, and secrets in version control history.

Metric	Value
Tool	gitleaks v8.18.0
Target	10-commit controlled git repo
Secrets planted	8 (across 7 files)
Secrets found	6/8 unique (75%)
Scan time	13.2ms
False positives	0 (1 low-risk OAuth ID flagged for triage)
Campaign result	PASS

If your git history has never been scanned for secrets, the question isn't whether there are exposed credentials — it's how many and how old the exposure is. gitleaks detect --source . --verbose answers that question in under 30 seconds for most repositories. It's free, it's fast, and it reads history that your developers thought was safely deleted.

Also worth reading alongside this demo: Penetration Testing by Georgia Weidman, which covers credential discovery as part of a complete assessment methodology — context for understanding where git history scanning fits in a real engagement.