May 13, 2024 Marie H.

Responding to a Secret Leakage in Git History




This post covers how to respond when a secret ends up in git history. I'm not going to describe the specific credentials involved or which system they accessed, but the process I'm describing here is what we actually worked through. The decisions matter more than the specifics.

How It Was Found

Automated secret scanning on a pull request flagged a credential pattern — a string that matched a known format for a specific type of service credential. The scanner was running as part of our CI pipeline, and it flagged a file that appeared to include an old configuration block that someone had pasted in for reference and never removed.

Pulling on that thread: the credential had been introduced in a commit several months prior. It wasn't in the most recent commit — it was in history, which means it had been quietly sitting there through dozens of subsequent commits, visible to anyone who cloned the repo or had access to the git log.

The first question in the room: is the repo public or private? Private, access-controlled repository. That's better than public, but it doesn't mean the risk is zero. Anyone with repo access has access to the full history. That includes service accounts used by CI/CD systems, which often have broad access and may have their tokens visible in other places. You cannot assume a private repo means a contained secret.

Rotate First, Investigate Second

This is the single most important decision in a secret leakage incident: rotate the credential before you do anything else. Before the investigation, before the git cleanup, before you call the postmortem meeting.

The logic is simple. The secret is already compromised — it has been for however long it's been in history. The only thing that stops the bleeding is rotation. Every minute you spend investigating before rotating is a minute the credential is still valid and usable.

The order matters practically as well. Once you rotate the credential, the window of exposure for that specific secret closes. Now you're investigating a historical exposure, not an active one. That's a fundamentally different posture.

Rotation process: find every system that uses the credential, update them with the new secret, verify they're functioning, then confirm the old credential is invalidated. Document when the new credential was issued.

Assessing the Exposure

After rotation, reconstruct the timeline:

  • When was the commit that introduced the credential? (git log --all --full-history -- <file> to find the commit, then git show COMMIT_HASH to verify)
  • What was the earliest date the credential was accessible in history?
  • Who has access to the repo? Pull the full access list — humans and service accounts.
  • Check audit logs for the service that used the credential. Were there any accesses that don't correspond to known legitimate use? Focus on the time window from when the credential was committed to when it was rotated. Unusual authentication patterns, access from unexpected source IPs, or access at unusual hours are worth investigating.
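The first two checklist items can be sketched with git's pickaxe search, which finds every commit where the number of occurrences of a string changed. The demo below runs in a throwaway scratch repo with a fake value standing in for the real credential:

```shell
# Scratch repo standing in for the real one; the "secret" is a fake value.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name demo

printf 'api_key = FAKE_LEAKED_VALUE\n' > config.ini
git add config.ini && git commit -qm 'add config'       # secret enters here
printf 'debug = false\n' >> config.ini
git add config.ini && git commit -qm 'unrelated change' # secret still present

# Pickaxe (-S): list only the commits where the occurrence count of the
# string changed. Oldest first, so the first line dates the exposure window.
git log --all --reverse --format='%h %cI %s' -S 'FAKE_LEAKED_VALUE'
```

git log -S reports both the commit that introduced the string and any commit that later removed it; the file-based git log --all --full-history -- <file> from the checklist complements it when you know the file but not the exact value.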

In our case, audit logs showed no anomalous access patterns. That's reassuring but not conclusive — a sophisticated attacker doesn't necessarily make noise in logs, and some systems have audit logs that aren't complete. We documented the log review and its findings and moved on.

Cleaning the Git History

Once you've rotated and assessed, clean the git history. Two tools:

git filter-branch is the built-in approach. It's slow, the syntax is painful, and it rewrites every commit in the specified range, not just the ones that touched the affected file. The git documentation itself now steers you toward alternatives. For anything but tiny repos, don't use it.

BFG Repo-Cleaner is what you actually want: a Java tool that does the same job faster and with a simpler interface. One caveat: by default BFG protects the tip commit of HEAD, so remove the secret from the current files with a normal commit before running it. To remove a specific string (the credential) from all history:

# Create a file with the secrets to remove, one per line
echo "ACTUAL_SECRET_VALUE" > secrets.txt

# Run BFG against a bare clone of the repo
git clone --mirror git@github.com:org/repo.git repo-mirror.git
java -jar bfg.jar --replace-text secrets.txt repo-mirror.git

# Expire old refs and run gc to actually remove the data
cd repo-mirror.git
git reflog expire --expire=now --all
git gc --prune=now --aggressive

# Push the cleaned history
git push --force

The --force push rewrites the remote history. This is one of the few situations where force-pushing to main is the correct action. It's also destructive — anyone who has cloned the repo still has the old history on their machine, and their next pull will fail or report a divergence because their local commits no longer exist on the rewritten remote.

This is the hard operational part: you need to notify every person and every system that has cloned the repository, tell them what happened (at a minimum: the history has been rewritten, they need to re-clone), and verify that CI/CD systems, developer machines, and any other systems that may have cached the old history have been refreshed. If someone with the old clone pushes, the leaked credential comes back.
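What a consumer of the repo actually experiences after the rewrite, and the fix, can be simulated end to end with local scratch repositories (names, paths, and the token below are all illustrative):

```shell
# All-local simulation: a bare "origin", an author who leaks then rewrites
# history, and a consumer who must resync. Names and token are placeholders.
tmp=$(mktemp -d) && cd "$tmp"
git init -q --bare -b main origin.git

git clone -q origin.git author
( cd author
  git config user.email a@example.com
  git config user.name author
  echo 'token=LEAKED' > app.cfg
  git add app.cfg && git commit -qm 'add config'
  git branch -M main
  git push -q origin main )

# A consumer clones while the secret is still in history
git clone -q origin.git consumer

# The author rewrites history (amend here; BFG in real life) and force-pushes
( cd author
  sed -i.bak 's/LEAKED/REDACTED/' app.cfg && rm app.cfg.bak
  git commit -qa --amend -m 'add config'
  git push -qf origin main )

# The consumer's clone has diverged: either re-clone, or fetch the
# rewritten history and hard-reset the local branch onto it
( cd consumer
  git fetch -q origin
  git reset -q --hard origin/main )
```

Re-cloning is the blunter but safer instruction to broadcast: fetch plus reset --hard only fixes the checked-out branch, while local feature branches and stashes can still reference the old commits.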

We sent a team-wide notification, temporarily locked the repo to prevent pushes while we coordinated, and tracked re-clones through a brief required acknowledgment step in our internal tooling. Overkill for a small team, necessary for a large one.

Prevention: GitHub Secret Scanning and Pre-Commit Hooks

GitHub's push protection can block secrets at push time — it recognizes common credential formats and rejects the push before it lands. It won't help with a credential that's already in history, but it prevents the next one. If you're on GitHub and haven't enabled it, do it now.

A second prevention layer is a pre-commit hook, which catches secrets before they even reach a commit. We standardized detect-secrets across all repos after this incident:

pip install detect-secrets pre-commit

# Add to .pre-commit-config.yaml:
repos:
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']

# Activate the hook in the clone (the config file alone does nothing):
pre-commit install

Initialize the baseline (which documents known false positives that are acceptable):

detect-secrets scan > .secrets.baseline

Commit .secrets.baseline to the repo. From that point, any new string that matches a credential pattern blocks the commit. The developer sees the error and has to either remove the secret or explicitly add it to the baseline with a justification.

The key to making this stick: enforce it. Add a CI check that scans the repository against the committed .secrets.baseline and fails the build if new candidate secrets appear. Pre-commit hooks can be bypassed locally with --no-verify. The CI check cannot.
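One way that CI step can look — a sketch, not our exact pipeline. It uses detect-secrets-hook, the entry point the pre-commit integration calls, which exits nonzero when a file contains a candidate secret missing from the baseline:

```shell
# CI pipeline fragment (sketch): fail the build on any potential secret
# that is not already recorded in the committed baseline.
pip install detect-secrets

# detect-secrets-hook exits nonzero if any listed file contains a candidate
# secret absent from .secrets.baseline; xargs propagates that exit code,
# which fails the CI job.
git ls-files -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline
```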

Process Changes After the Incident

We made three process changes:

Secret rotation schedule. All long-lived credentials now have a rotation date in a tracking spreadsheet. Quarterly rotation for most credentials, annually for low-risk ones, and a documented owner for each. The rotation schedule is reviewed monthly.

Documented remediation playbook. The exact steps above, written down and stored somewhere that's accessible if the primary communication systems are unavailable (our git repos, our Slack, our CI/CD system could all theoretically be the thing that's compromised). We keep a copy in Confluence and on a shared drive.

PR checklist item. One line added to our PR template: "Does this PR contain any credentials, API keys, tokens, or private keys?" It's not sophisticated, but it's a prompt that catches the case where someone added a credential as a "temporary" config value and was planning to remove it before merging but forgot.

The Postmortem

We ran a blameless postmortem two weeks after the incident. The goal was systemic fixes, not attribution. It doesn't matter who committed the secret — the system should have caught it before it merged, and it didn't.

The three systemic gaps we identified:

  1. No automated secret scanning on push. The CI pipeline ran tests and linters but had no credential scanning.
  2. No enforced pre-commit hook. Several engineers had detect-secrets installed locally but it wasn't standardized or enforced.
  3. No documented playbook. When the incident happened, we spent the first 30 minutes figuring out the right order of operations. That time cost should be zero in a real incident.

All three were fixed within two weeks of the postmortem. The fixes themselves are not complicated — the gap was that nobody had prioritized setting them up before an incident demonstrated why they were necessary.

The thing I'd emphasize to other teams: secret scanning is not expensive to set up and the cost of not having it is paid exactly once, under the worst conditions. Set it up before you need it.