Migrating Container Images from GCR to Artifact Registry
Google announced the deprecation of Container Registry (GCR) in May 2023, with the transition to Artifact Registry as the replacement. We had container image references scattered across dozens of repositories — Dockerfiles, Kubernetes manifests, Helm values files, CI/CD pipelines. Getting all of them updated before the deadline was a coordination and auditing problem more than a technical one. Here's how we handled it.
Why GCR Was Being Replaced
GCR (gcr.io, us.gcr.io, eu.gcr.io, asia.gcr.io) was a container-only registry. Artifact Registry is a unified artifact storage service that supports container images, Maven packages, npm packages, Python packages, Go modules, and Debian/RPM packages. It's the strategic direction for GCP artifact storage.
From an operations perspective, Artifact Registry also adds features that GCR didn't have: per-repository IAM policies, repository-level cleanup policies for automatic deletion of old images, vulnerability scanning at the repository level, and CMEK support for customer-managed encryption keys.
The Redirect Situation
Starting in 2023, GCR domains redirect to Artifact Registry — gcr.io/my-project/my-image transparently serves from an automatically created Artifact Registry repository. This redirect will not last indefinitely, and it only applies to images that already existed in GCR. New images pushed to gcr.io after the transition are being rejected.
The redirect buys time, but it's not a migration strategy. We needed explicit pkg.dev references everywhere.
Creating an Artifact Registry Repository
For each repository we needed, we created it explicitly via Terraform:
resource "google_artifact_registry_repository" "containers" {
  location      = "us-central1"
  repository_id = "containers"
  description   = "Primary container image repository"
  format        = "DOCKER"
  project       = var.project_id

  docker_config {
    immutable_tags = false
  }

  cleanup_policy_dry_run = false

  labels = {
    managed-by = "terraform"
    team       = "platform"
  }
}
We created one repository per environment (dev, staging, prod) rather than one monolithic repository. This let us set different retention policies — dev images get cleaned up aggressively, prod images have longer retention.
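We drove repository creation through Terraform, but the same per-environment layout can be sketched with plain gcloud. The project names here are illustrative, and the commands are echoed so you can review them before running; drop the echo to actually create the repositories.

```shell
# Sketch: one "containers" repository per environment project.
# Project names are hypothetical; echo first, run after review.
for env in dev staging prod; do
  echo gcloud artifacts repositories create containers \
    --repository-format=docker \
    --location=us-central1 \
    --project="my-project-$env"
done
```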
Updating References
The new Artifact Registry URL format is:
REGION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG
Compared to GCR:
gcr.io/PROJECT/IMAGE:TAG
us.gcr.io/PROJECT/IMAGE:TAG
We needed to update references in:
Dockerfiles — FROM gcr.io/my-project/base-image:latest becomes FROM us-central1-docker.pkg.dev/my-project/containers/base-image:latest
Kubernetes manifests — image: gcr.io/my-project/my-service:abc123 becomes image: us-central1-docker.pkg.dev/my-project/containers/my-service:abc123
Helm values — same substitution in any image values
CI/CD pipelines — GitHub Actions workflow steps, Cloud Build configs, anywhere images are built or referenced
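The substitutions above are mechanical enough to script. A minimal sketch, assuming every image moves into a single `containers` repository in `us-central1` (adjust the region and repository name for your layout, and review the diff before committing):

```shell
# Rewrite gcr.io references to Artifact Registry form.
# Assumes all images land in the "containers" repo in us-central1.
rewrite_refs() {
  sed -E \
    -e 's#(us\.|eu\.|asia\.)?gcr\.io/([a-z0-9-]+)/#us-central1-docker.pkg.dev/\2/containers/#g' \
    "$@"
}

# Example:
echo 'image: gcr.io/my-project/my-service:abc123' | rewrite_refs
# → image: us-central1-docker.pkg.dev/my-project/containers/my-service:abc123
```

Run it over files in place with `rewrite_refs -i file.yaml` on GNU sed; the regional `us.`/`eu.`/`asia.` prefixes are folded into the same target repository, which only works if that matches where you actually pushed the images.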
To audit the full scope before making changes, I wrote a script to find all references across the GitHub org:
#!/bin/bash
# Find all gcr.io references across the org's repositories.
gh repo list MY_ORG --limit 200 --json name -q '.[].name' | while read -r repo; do
  # List candidate files: manifests, Terraform, and Dockerfiles.
  # (A plain "Dockerfile" has no extension, so match it separately.)
  gh api "repos/MY_ORG/$repo/git/trees/HEAD?recursive=1" \
      --jq '.tree[] | select(.type=="blob") | .path' 2>/dev/null | \
    grep -E '(\.(yaml|yml|json|tf|txt)|Dockerfile)$' | \
    while read -r file; do
      content=$(gh api "repos/MY_ORG/$repo/contents/$file" --jq '.content' 2>/dev/null | base64 -d 2>/dev/null)
      if echo "$content" | grep -q 'gcr\.io'; then
        echo "$repo/$file"
      fi
    done
done
This gave us a list of every file across every repository that contained a gcr.io reference. The output was longer than I expected — 340+ files across 60+ repositories.
Authentication
For local development, configure Docker to authenticate to Artifact Registry:
gcloud auth configure-docker us-central1-docker.pkg.dev
This adds a credential helper to your Docker config that uses your active gcloud credentials for us-central1-docker.pkg.dev requests.
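For reference, this is roughly what that command merges into the Docker config (a minimal sketch; your file will also contain any existing `auths` and helper entries):

```shell
# What `gcloud auth configure-docker us-central1-docker.pkg.dev` adds to
# ~/.docker/config.json: the domain mapped to the "gcloud" helper.
cat <<'EOF'
{
  "credHelpers": {
    "us-central1-docker.pkg.dev": "gcloud"
  }
}
EOF
```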
For GKE workloads, the node pool service account needs the appropriate IAM role:
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:${NODE_POOL_SA}@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.reader"
If you're using Workload Identity, grant the role to the GCP service account that your Kubernetes service accounts are bound to. GKE nodes using the default compute service account already have Artifact Registry read access within the same project — but explicit grants are cleaner than relying on defaults.
For cross-project access (reading images from a central registry project into application projects), the IAM binding goes on the specific Artifact Registry repository:
gcloud artifacts repositories add-iam-policy-binding containers \
  --location=us-central1 \
  --project=my-registry-project \
  --member="serviceAccount:my-app-sa@my-app-project.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.reader"
Cleanup Policies
One of the better features Artifact Registry adds over GCR is cleanup policies — automatic deletion of old images based on tag patterns, age, or count. We set these up for dev and staging immediately:
# Delete untagged images older than 7 days; keep tagged images newer than 30 days.
# Note: --policy takes a path to a policy file, not inline JSON.
cat > cleanup-policy.json <<'EOF'
[
  {
    "name": "delete-old-untagged",
    "action": {"type": "Delete"},
    "condition": {
      "tagState": "UNTAGGED",
      "olderThan": "604800s"
    }
  },
  {
    "name": "keep-recent-tagged",
    "action": {"type": "Keep"},
    "condition": {
      "tagState": "TAGGED",
      "newerThan": "2592000s"
    }
  }
]
EOF

gcloud artifacts repositories set-cleanup-policies containers \
  --project=my-dev-project \
  --location=us-central1 \
  --policy=cleanup-policy.json
Before this we were paying for storage on years of accumulated dev image layers. The first cleanup run reclaimed several hundred GB.
The Actual Cutover
We ran the migration in three waves, roughly grouped by team. Each wave followed the same sequence:
- Create PRs to update all image references in that team's repositories (scripted substitution, then manual review)
- Update CI/CD pipelines to push to Artifact Registry
- Deploy the updated manifests to staging first
- Watch a full deploy cycle to confirm image pulls succeed
- Merge the production changes
The main source of errors during migration was images that existed in GCR but hadn't been rebuilt yet — you'd update the manifest to point to pkg.dev but the image hadn't been pushed there yet, causing pull failures. Solution: always push to Artifact Registry first, then update the reference.
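For images that didn't need a rebuild, copying them directly from GCR is an alternative to re-pushing. A sketch using `gcrane cp` (from go-containerregistry); the image list and project names are illustrative, and the commands are echoed so you can review them before running:

```shell
# Copy existing images from GCR into Artifact Registry before flipping
# references. Image names are hypothetical; drop the echo to actually copy.
images="base-image:latest my-service:abc123"
for img in $images; do
  echo gcrane cp "gcr.io/my-project/$img" \
    "us-central1-docker.pkg.dev/my-project/containers/$img"
done
```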
We also had a few places where image tags were hardcoded as SHAs from GCR builds — those images didn't exist in Artifact Registry at all and needed to be identified and rebuilt. The audit script caught most of these but not all; some turned up during staging deploys.
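A pre-merge existence check would have caught more of these. A hedged sketch: extract every pkg.dev reference from a manifest tree, then probe each one with `gcloud artifacts docker images describe` (the `k8s/` path is an assumption about your repo layout):

```shell
# Pull every Artifact Registry image reference out of a file tree.
extract_refs() {
  grep -rhoE '[a-z0-9-]+-docker\.pkg\.dev/[^ "]+' "$@" | sort -u
}

# Then verify each reference resolves (run where gcloud is authenticated):
#   extract_refs k8s/ | while read -r ref; do
#     gcloud artifacts docker images describe "$ref" >/dev/null 2>&1 \
#       || echo "MISSING: $ref"
#   done
```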
Total timeline: about six weeks from starting the audit to having the last production reference updated. The actual work was maybe two weeks spread across the team; the rest was coordination time and waiting for deployment cycles.
