
Overview

This guide covers GCP-specific integration for CrewAI Platform deployments on Google Kubernetes Engine (GKE). It focuses on configuring GCS for object storage, Artifact Registry (or registry-less Bucket Deployment) for crew image builds, Cloud SQL for PostgreSQL, and optionally Secret Manager — all authenticated via GKE Workload Identity Federation (no static keys).
This guide assumes you have:
  • A GKE cluster running Kubernetes 1.28+ with Workload Identity enabled
  • gcloud CLI and kubectl configured
  • Helm 3.10+ installed
  • Basic familiarity with GCP services (Cloud SQL, GCS, Artifact Registry)

Prerequisites

Before configuring CrewAI Platform, ensure the following GCP components are in place.

CrewAI Platform supports AMD64 (x86_64) Kubernetes worker nodes. ARM64 (aarch64) worker nodes are not currently supported. For full platform requirements, see the Requirements Guide.

Required GCP Infrastructure

| Component | Documentation Link | Notes |
|---|---|---|
| GKE Cluster | GKE Quickstart | 1.28+ with Workload Identity enabled |
| Workload Identity | Workload Identity Federation | Must be enabled at cluster and node pool level |
| VPC and Subnets | GKE Network Overview | Private cluster recommended |

Required GCP APIs

Enable the following APIs in your project before proceeding:
gcloud services enable \
  container.googleapis.com \
  sqladmin.googleapis.com \
  storage-api.googleapis.com \
  artifactregistry.googleapis.com \
  secretmanager.googleapis.com \
  certificatemanager.googleapis.com \
  iam.googleapis.com
Do not proceed with CrewAI installation until these prerequisites are met. The Helm chart will fail to deploy without them.

Step 1: Create the GCP Service Account

All CrewAI workloads (web, worker, buildkit) share a single GCP Service Account (GSA) that is mapped to the Kubernetes ServiceAccount via Workload Identity. This GSA needs permissions for GCS, Artifact Registry, Cloud SQL, and optionally Secret Manager.
export GCP_PROJECT_ID="your-gcp-project"
export GCP_REGION="us-central1"
export GSA_NAME="crewai-platform"
export GKE_NAMESPACE="crewai"           # Kubernetes namespace for the Helm release
export KSA_NAME="crewai-sa"             # Kubernetes ServiceAccount (chart default)

# Create the GCP Service Account
gcloud iam service-accounts create $GSA_NAME \
  --display-name="CrewAI Platform" \
  --project=$GCP_PROJECT_ID

Step 2: Grant IAM Roles

Bind the required IAM roles to the service account. Each role maps to a specific CrewAI requirement:
| Role | Assigned To | Purpose |
|---|---|---|
| roles/storage.objectAdmin | GSA | Read/write crew artifacts and uploads in GCS. Also covers image bucket read/write when using Bucket Deployment |
| roles/iam.serviceAccountTokenCreator | GSA | Sign GCS URLs via IAM signBlob (required for Workload Identity) |
| roles/artifactregistry.writer | GSA | Push and pull crew container images (build pods via Workload Identity). Not needed if using Bucket Deployment |
| roles/cloudsql.client | GSA | Connect to Cloud SQL via the Auth Proxy |
| roles/cloudsql.instanceUser | GSA | IAM-based database authentication (only if using autoIamAuthn: true) |
| roles/artifactregistry.reader | Node pool GCE SA | Pull crew images from GAR (kubelet uses node-level credentials). Not needed if using Bucket Deployment |
| roles/secretmanager.secretAccessor | GSA | Read secrets from Secret Manager (optional) |
# GCS - object storage for crew artifacts and uploads
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# GCS - signed URL generation via IAM signBlob (required for Workload Identity)
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountTokenCreator"

# Artifact Registry - push/pull crew container images
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.writer"

# Cloud SQL - database connectivity via Cloud SQL Auth Proxy
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"

# Cloud SQL - IAM database authentication (only if using autoIamAuthn: true)
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/cloudsql.instanceUser"

# (Optional) Secret Manager - external secrets
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"

# Node pool GCE SA - pull crew images from Artifact Registry
# The kubelet uses the node's compute SA (not Workload Identity) for image pulls
GCE_SA="$(gcloud projects describe $GCP_PROJECT_ID --format='value(projectNumber)')-compute@developer.gserviceaccount.com"
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GCE_SA}" \
  --role="roles/artifactregistry.reader"
For tighter security, scope roles/storage.objectAdmin and roles/artifactregistry.writer to specific resources using --condition flags or bucket/repo-level IAM instead of project-level bindings.

The node pool GCE SA binding is separate from the GSA bindings. The platform injects a short-lived GAR token at deploy time for immediate pulls, but the node SA provides a reliable fallback when crew pods are rescheduled after the token expires.

Step 3: Bind Workload Identity

Create the IAM policy bindings that allow the Kubernetes ServiceAccounts to impersonate the GCP Service Account. Two bindings are needed: one for the platform namespace (web, worker, buildkit daemon) and one for the crews namespace (build pods that push images to GAR):
# Platform namespace — used by web, worker, and buildkit daemon pods
gcloud iam service-accounts add-iam-policy-binding \
  ${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"

# Crews namespace — used by build pods that push crew images to Artifact Registry
gcloud iam service-accounts add-iam-policy-binding \
  ${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[crewai-crews/default]"
After the initial Helm install (which creates the crewai-crews namespace automatically), annotate the default ServiceAccount in the crews namespace so build pods can authenticate to GAR via Workload Identity:
kubectl annotate serviceaccount default -n crewai-crews --overwrite \
  iam.gke.io/gcp-service-account=${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com
The [NAMESPACE/KSA_NAME] in the member string must exactly match the Kubernetes namespace and the ServiceAccount name. The chart creates crewai-sa by default when rbac.create: true. The crews namespace (crewai-crews by default) uses the default ServiceAccount for build pods.

The kubectl annotate command must run after the first helm install because the Helm chart creates the crewai-crews namespace. If you need to annotate before installing, create the namespace manually first: kubectl create namespace crewai-crews.
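The member string format is easy to get wrong. As a quick sanity check, you can assemble it from the variables set earlier and compare it against the binding you created (the values below are this guide's placeholders):

```shell
# Assemble the Workload Identity member string from the variables set earlier.
# The bracketed portion must match the namespace and KSA name exactly.
GCP_PROJECT_ID="your-gcp-project"
GKE_NAMESPACE="crewai"
KSA_NAME="crewai-sa"
MEMBER="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"
echo "$MEMBER"
```

You can then confirm the binding exists by running gcloud iam service-accounts get-iam-policy against the GSA and looking for this exact member under roles/iam.workloadIdentityUser.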

Cloud SQL for PostgreSQL

CrewAI Platform requires PostgreSQL 16+ for production deployments.

Cloud SQL Instance Sizing

Minimum recommended specifications based on CrewAI workload characteristics:
| Deployment Size | Machine Type | vCPU | RAM | Storage |
|---|---|---|---|---|
| Development | db-custom-2-4096 | 2 | 4 GiB | 50 GiB SSD |
| Small Production | db-custom-2-16384 | 2 | 16 GiB | 100 GiB SSD |
| Medium Production | db-custom-4-32768 | 4 | 32 GiB | 250 GiB SSD |
| Large Production | db-custom-8-65536 | 8 | 64 GiB | 500 GiB SSD |

Create the Cloud SQL Instance

export SQL_INSTANCE="crewai-db"

gcloud sql instances create $SQL_INSTANCE \
  --database-version=POSTGRES_16 \
  --tier=db-custom-2-16384 \
  --region=$GCP_REGION \
  --storage-size=100GB \
  --storage-type=SSD \
  --storage-auto-increase \
  --network=default \
  --no-assign-ip

Create Databases and User

CrewAI requires three databases (primary, cable, and OAuth), plus an optional fourth for Wharf tracing:
# Create databases
gcloud sql databases create crewai_plus_production --instance=$SQL_INSTANCE
gcloud sql databases create crewai_plus_cable_production --instance=$SQL_INSTANCE
gcloud sql databases create crewai_plus_oauth_production --instance=$SQL_INSTANCE

# If using Wharf OTLP trace collector (wharf.enabled: true)
gcloud sql databases create wharf --instance=$SQL_INSTANCE
You can use either password-based or IAM-based authentication.

Option A: Password-based authentication (simpler)
gcloud sql users create crewai \
  --instance=$SQL_INSTANCE \
  --password="YOUR_SECURE_PASSWORD"
Option B: IAM-based authentication (no static passwords)

IAM authentication uses Workload Identity to authenticate the application to Cloud SQL via short-lived tokens. No DB_PASSWORD is needed.

Step 1: Enable IAM authentication on the Cloud SQL instance:
gcloud sql instances patch $SQL_INSTANCE \
  --database-flags=cloudsql.iam_authentication=on
The cloudsql.iam_authentication flag is off by default. Without it, Cloud SQL rejects IAM tokens regardless of user or role configuration.
Step 2: Create the IAM database user:
gcloud sql users create ${GSA_NAME}@${GCP_PROJECT_ID}.iam \
  --instance=$SQL_INSTANCE \
  --type=CLOUD_IAM_SERVICE_ACCOUNT
Step 3: Grant the GSA the Cloud SQL IAM login role:
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/cloudsql.instanceUser"
Step 4: Grant SQL privileges to the IAM user. Connect to the Cloud SQL instance using the postgres superuser (recommended) or an existing admin user. You can connect via gcloud sql connect or from inside a running pod:
# Option 1: Connect via gcloud (from your local machine)
gcloud sql connect $SQL_INSTANCE --user=postgres --database=postgres

# Option 2: Connect from inside a pod with the Cloud SQL Auth Proxy sidecar
kubectl exec -it -n $GKE_NAMESPACE deploy/crewai-web -c $(kubectl get deploy crewai-web -n $GKE_NAMESPACE -o jsonpath='{.spec.template.spec.containers[0].name}') -- \
  bash -c 'PGPASSWORD=YOUR_PASSWORD psql -h 127.0.0.1 -U postgres -d postgres'
Then run the following SQL commands (replace GSA_NAME@GCP_PROJECT_ID.iam with your actual IAM user, e.g., crewai-platform@your-project.iam):
-- Grant connect + full privileges on each database
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_cable_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_oauth_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
-- If using Wharf (wharf.enabled: true)
GRANT ALL PRIVILEGES ON DATABASE wharf TO "GSA_NAME@GCP_PROJECT_ID.iam";

-- Switch to primary database and grant schema/table/sequence access
\c crewai_plus_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";

-- Repeat for cable database
\c crewai_plus_cable_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";

-- Repeat for OAuth database
\c crewai_plus_oauth_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";
Migrating from password-based to IAM auth? If your databases already have tables created by a password-based user (e.g., crewai), you must run the GRANT ALL ON ALL TABLES and GRANT ALL ON ALL SEQUENCES commands above for each database. Without these grants, the IAM user will get permission denied for table schema_migrations errors during migrations. The ALTER DEFAULT PRIVILEGES statements only apply to future objects — they do not retroactively grant access to existing tables.
Step 5: Transfer table ownership to the IAM user.

PostgreSQL DDL operations (e.g., ALTER TABLE, DROP COLUMN) require the executing user to be the owner of the table — GRANT ALL PRIVILEGES is not sufficient. If the databases were initially set up by a different user (e.g., postgres or a password-based crewai user), the IAM user will not own existing tables, and future upgrade migrations that modify table structure will fail. The pre-upgrade migration hook automatically detects this condition and will report the affected tables with the exact SQL to fix them.

To prevent this from blocking your first upgrade, transfer ownership proactively. Connect to the database as the postgres superuser using the same host and port the application uses:
PGPASSWORD=<postgres-password> psql -h "$DB_HOST" -p "${DB_PORT:-5432}" -U postgres -d "$POSTGRES_DB"
Then run the following SQL (replace the IAM user with your actual value, e.g., crewai-platform@your-project.iam):
-- Transfer all tables and sequences in the primary database
\c crewai_plus_production
DO $$ DECLARE r RECORD; BEGIN
  FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
  FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
END $$;

-- Repeat for cable database
\c crewai_plus_cable_production
DO $$ DECLARE r RECORD; BEGIN
  FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
  FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
END $$;

-- Repeat for OAuth database
\c crewai_plus_oauth_production
DO $$ DECLARE r RECORD; BEGIN
  FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
  FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
END $$;
Any tables the IAM user creates going forward (via Rails migrations) are automatically owned by it, so this transfer only needs to be done once for pre-existing tables.
Use the postgres superuser for these commands. Only the current owner or a superuser can transfer table ownership. A non-superuser like crewai can run GRANT commands on objects it owns, but cannot run ALTER TABLE ... OWNER TO for tables it doesn’t own.
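To check which tables still need a transfer, you can list the current owners first with a read-only query (run it in each database):

```sql
-- Show the owner of each table in the public schema; any table not owned by
-- the IAM user will block DDL migrations run as that user.
SELECT tablename, tableowner
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY tablename;
```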
The Helm chart includes a built-in Cloud SQL Auth Proxy sidecar. When enabled, it runs alongside the web, worker, OAuth, Wharf, and job containers, authenticating via Workload Identity. The app connects to 127.0.0.1 through the proxy.
# Helm values
cloudSqlProxy:
  enabled: true
  instanceConnectionName: "your-project:us-central1:crewai-db"  # GCP_PROJECT_ID:GCP_REGION:INSTANCE
  port: 5432
  privateIp: true           # Use private IP (recommended for VPC-peered instances)
  autoIamAuthn: true         # IAM-based authentication (no password needed)
If you prefer password-based auth through the proxy, set autoIamAuthn: false and provide DB_PASSWORD via secrets.

Helm Database Configuration

For IAM authentication (autoIamAuthn: true):
postgres:
  enabled: false  # Disable built-in PostgreSQL

envVars:
  DB_HOST: "127.0.0.1"  # Cloud SQL Auth Proxy
  DB_PORT: "5432"
  DB_USER: "crewai-platform@your-project.iam"  # GSA email minus .gserviceaccount.com
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"
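In the values above, DB_USER is simply the GSA email with the .gserviceaccount.com suffix removed. A quick shell sketch of the derivation, using this guide's placeholder project:

```shell
# Derive the Cloud SQL IAM user name from the GSA email:
# IAM database users drop the ".gserviceaccount.com" suffix.
GSA_EMAIL="crewai-platform@your-project.iam.gserviceaccount.com"
DB_USER="${GSA_EMAIL%.gserviceaccount.com}"
echo "$DB_USER"   # crewai-platform@your-project.iam
```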
No DB_PASSWORD is needed — the Cloud SQL Auth Proxy handles authentication automatically via Workload Identity.

For password-based authentication (autoIamAuthn: false):
postgres:
  enabled: false

envVars:
  DB_HOST: "127.0.0.1"
  DB_PORT: "5432"
  DB_USER: "crewai"
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"

secrets:
  DB_PASSWORD: "YOUR_SECURE_PASSWORD"

Google Cloud Storage for Object Storage

CrewAI Platform uses GCS for storing crew artifacts, tool outputs, and user uploads. With Workload Identity, no static credentials are needed.

Create GCS Bucket

export GCS_BUCKET="crewai-prod-storage"

gsutil mb -p $GCP_PROJECT_ID -l $GCP_REGION -b on gs://$GCS_BUCKET

# Enable versioning for data protection
gsutil versioning set on gs://$GCS_BUCKET

# Set lifecycle rule to clean up old versions (optional)
cat > /tmp/lifecycle.json << 'EOF'
{
  "rule": [{
    "action": {"type": "Delete"},
    "condition": {"numNewerVersions": 3, "isLive": false}
  }]
}
EOF
gsutil lifecycle set /tmp/lifecycle.json gs://$GCS_BUCKET

Helm Configuration for GCS

envVars:
  STORAGE_SERVICE: "google"
  GCS_PROJECT_ID: "your-gcp-project"
  GCS_BUCKET: "crewai-prod-storage"
  GCS_IAM_SIGNING: "true"
  # Optional: explicit GSA email for signed URLs (auto-detected from metadata server if blank)
  # GCS_SIGNING_EMAIL: "crewai-platform@your-gcp-project.iam.gserviceaccount.com"
No credentials or keyfile configuration is required. The google-cloud-storage gem uses Application Default Credentials (ADC), which are automatically provided by GKE Workload Identity.
GCS_IAM_SIGNING is required when using Workload Identity. Without it, GCS signed URL generation will fail because Workload Identity credentials don’t include a private key for local signing. When enabled, the platform uses the IAM signBlob API instead, which requires the roles/iam.serviceAccountTokenCreator role granted in Step 2.

If GCS_SIGNING_EMAIL is left blank, the service account email is automatically detected from the GKE metadata server.

Artifact Registry for Container Images

CrewAI Platform requires Artifact Registry for storing crew automation container images. When users create and deploy crews, CrewAI builds container images and pushes them to your registry.

Repository Requirements

Critical Requirements:
  • The final repository URI ends in /crewai-enterprise; the platform appends this suffix automatically
  • Immutable tags must be disabled (CrewAI overwrites tags for crew versions)

Create Artifact Registry Repository

export AR_REPO="crewai"

gcloud artifacts repositories create $AR_REPO \
  --repository-format=docker \
  --location=$GCP_REGION \
  --description="CrewAI Platform crew images"

# Verify the repository URI
gcloud artifacts repositories describe $AR_REPO \
  --location=$GCP_REGION \
  --format='value(name)'
# Output: projects/GCP_PROJECT_ID/locations/GCP_REGION/repositories/crewai
The resulting registry host is GCP_REGION-docker.pkg.dev. For example: us-central1-docker.pkg.dev/your-project/crewai.

Valid repository URIs (set in CREW_IMAGE_REGISTRY_OVERRIDE):
  • us-central1-docker.pkg.dev/your-project/crewai
  • us-docker.pkg.dev/your-project/crewai
  • europe-west1-docker.pkg.dev/your-project/crewai

Helm Configuration for Artifact Registry

envVars:
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
  # Note: /crewai-enterprise suffix is added automatically by CrewAI Platform
The Helm chart auto-detects the GAR host from CREW_IMAGE_REGISTRY_OVERRIDE when it matches the *-docker.pkg.dev pattern and sets GCP_ARTIFACT_REGISTRY_HOST automatically. BuildKit pods then obtain short-lived access tokens from the GKE metadata server to authenticate pushes.
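The detection can be illustrated with a short shell sketch (an approximation of the chart's behavior, not its actual template code): the host is everything before the first slash, accepted only when it matches the *-docker.pkg.dev pattern.

```shell
CREW_IMAGE_REGISTRY_OVERRIDE="us-central1-docker.pkg.dev/your-project/crewai"

# Take everything before the first "/" and accept it only if it is a GAR host.
REGISTRY_HOST="${CREW_IMAGE_REGISTRY_OVERRIDE%%/*}"
case "$REGISTRY_HOST" in
  *-docker.pkg.dev) GCP_ARTIFACT_REGISTRY_HOST="$REGISTRY_HOST" ;;
  *)                GCP_ARTIFACT_REGISTRY_HOST="" ;;
esac
echo "$GCP_ARTIFACT_REGISTRY_HOST"   # us-central1-docker.pkg.dev
```

Multi-region hosts such as us-docker.pkg.dev match the same pattern, so they are detected as well.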

How Crew Pods Pull Images from GAR

When a crew is deployed, the platform automatically injects a fresh short-lived GAR access token into the crew namespace’s image pull secret. This ensures crew pods can immediately pull their built images from Artifact Registry without any manual credential management.

How it works:
  1. Build pods push images to GAR via Workload Identity (using the GSA’s roles/artifactregistry.writer)
  2. At deploy time, the platform fetches a fresh GAR access token from the GKE metadata server and merges it into the registry pull secret
  3. Crew pods use this enriched pull secret to pull their images from GAR
This mechanism is fully automatic — it activates whenever the Helm chart auto-detects a GAR registry from CREW_IMAGE_REGISTRY_OVERRIDE.
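The enriched pull secret is a standard dockerconfigjson whose auth entry is keyed by the GAR host, using oauth2accesstoken as the username and the short-lived token as the password. A minimal sketch of the payload shape (the token value below is a placeholder, not a real credential):

```shell
GAR_HOST="us-central1-docker.pkg.dev"
TOKEN="ya29.placeholder-token"   # placeholder; real tokens come from the GKE metadata server

# dockerconfigjson auth entries are base64("username:password")
AUTH=$(printf 'oauth2accesstoken:%s' "$TOKEN" | base64 | tr -d '\n')
printf '{"auths":{"%s":{"auth":"%s"}}}\n' "$GAR_HOST" "$AUTH"
```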

Enable GKE Node Image Pulls from GAR (Defense-in-Depth)

As a fallback for scenarios where a crew pod is rescheduled to a new node after the injected token has expired, the node pool’s compute service account must have roles/artifactregistry.reader. This is configured in Step 2: Grant IAM Roles.
This IAM binding is required for production reliability. Without it, crew pods may fail to pull images if they are rescheduled to a new node more than one hour after initial deployment.

Verifying Artifact Registry Access

Test that the Workload Identity binding is working:
# From a pod in the namespace
kubectl run -it --rm gcloud-test \
  --namespace=$GKE_NAMESPACE \
  --image=google/cloud-sdk:slim \
  --overrides="{\"spec\":{\"serviceAccountName\":\"${KSA_NAME}\"}}" \
  --restart=Never -- bash

# Inside the pod:
gcloud auth print-access-token  # Should succeed via Workload Identity
gcloud artifacts docker images list ${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/crewai

Bucket Deployment — Registry-Less Image Delivery

Bucket Deployment is an alternative to Artifact Registry that eliminates the need for any container registry. Instead of pushing built crew images to a registry, the platform stores them as compressed OCI tarballs in a GCS bucket and loads them directly into the container runtime on each Kubernetes node before deployment.
Bucket Deployment and Artifact Registry are mutually exclusive for crew image delivery. Choose one approach based on your requirements. Both still use BuildKit to build images — the difference is where the built image is stored and how it reaches the nodes.

When to Use Bucket Deployment

| Scenario | Recommended Approach |
|---|---|
| Standard GKE deployment | Artifact Registry |
| Customer policy prohibits container registries | Bucket Deployment |
| Air-gapped or restricted network (no registry access) | Bucket Deployment |
| Existing GCS infrastructure, want to avoid GAR setup | Bucket Deployment |
| Multi-region with global GAR replication | Artifact Registry |

How Bucket Deployment Works

The workflow has four phases:
  1. Build — When a crew is deployed, BuildKit builds the container image as usual, but instead of pushing it to a registry, it outputs an OCI tarball. The tarball is compressed with gzip and uploaded to a GCS bucket using a Workload Identity access token.
  2. Preload — A temporary Kubernetes DaemonSet is created, placing one pod on every node in the cluster. Each pod downloads the tarball from GCS, decompresses it, and imports it into the node’s containerd runtime using ctr images import. This makes the image available to the kubelet as if it had been pulled from a registry.
  3. Deploy — Once all nodes have the image loaded, the crew is deployed via Helm with imagePullPolicy: Never and no imagePullSecrets. The kubelet finds the image in its local store and starts the pod normally.
  4. Cleanup — The preloader DaemonSet is deleted. The imported images remain in the node’s containerd store.
GCS Bucket                    Kubernetes Cluster
┌─────────────────────┐       ┌────────────────────────────────────┐
│ crew-images/        │       │                                    │
│   <ref>.tar.gz ─────┼─wget──┼──→ Node 1: ctr import → ✓ cached  │
│                     │       │                                    │
│   <ref>.tar.gz ─────┼─wget──┼──→ Node 2: ctr import → ✓ cached  │
│                     │       │                                    │
│   <ref>.tar.gz ─────┼─wget──┼──→ Node 3: ctr import → ✓ cached  │
│                     │       │                                    │
└─────────────────────┘       │  All nodes ready → Helm deploy     │
                              │  (imagePullPolicy: Never)          │
                              │  → Preloader DaemonSet deleted     │
                              └────────────────────────────────────┘
The preloader DaemonSet runs with privileged: true to access the node’s containerd socket and ctr binary. This elevated access is ephemeral — it only exists during the deployment process and is cleaned up immediately after the image is loaded. The crew pods themselves run with normal, unprivileged security contexts.
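For reference, the preloader has roughly the following shape. This is an illustrative sketch only, not the chart's actual manifest — the name, labels, image, and bucket path are assumptions:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: image-preloader            # illustrative name
  namespace: crewai-crews
spec:
  selector:
    matchLabels:
      app: image-preloader
  template:
    metadata:
      labels:
        app: image-preloader
    spec:
      containers:
        - name: preload
          image: google/cloud-sdk:slim      # assumption: any image with gcloud available
          securityContext:
            privileged: true                # needed to reach the node's containerd
          command: ["/bin/sh", "-c"]
          args:
            - |
              gcloud storage cp "gs://crewai-crew-images/crew-images/<ref>.tar.gz" /tmp/img.tar.gz
              gunzip /tmp/img.tar.gz
              ctr -n k8s.io images import /tmp/img.tar
              sleep infinity                # stay Ready until the platform deletes the DaemonSet
          volumeMounts:
            - name: containerd-sock
              mountPath: /run/containerd/containerd.sock
            - name: ctr-bin
              mountPath: /usr/bin/ctr
      volumes:
        - name: containerd-sock
          hostPath:
            path: /run/containerd/containerd.sock   # see CONTAINERD_SOCKET_PATH
        - name: ctr-bin
          hostPath:
            path: /usr/bin/ctr                      # ctr binary under CTR_HOST_PATH
```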

Prerequisites

Before enabling Bucket Deployment, ensure you have:
  1. GCS bucket — A dedicated bucket for storing crew image tarballs
  2. IAM permissions — The Workload Identity GSA must have roles/storage.objectAdmin on the image bucket (for both upload and download)
  3. Workload Identity — Already configured per Step 3
  4. BuildKit — Enabled in the Helm chart (buildkit.enabled: true)
If you already configured roles/storage.objectAdmin at the project level for GCS object storage (see Step 2), the same binding covers the image bucket. No additional IAM configuration is needed — skip to Create the Image Bucket.

Create the Image Bucket

Create a dedicated GCS bucket for crew image tarballs. This bucket is separate from the general-purpose GCS_BUCKET used for crew artifacts and uploads.
export IMAGE_BUCKET="crewai-crew-images"

gcloud storage buckets create gs://$IMAGE_BUCKET \
  --project=$GCP_PROJECT_ID \
  --location=$GCP_REGION \
  --uniform-bucket-level-access
If you scoped roles/storage.objectAdmin to the general-purpose bucket (rather than project-level), grant access to the image bucket separately:
gcloud storage buckets add-iam-policy-binding gs://$IMAGE_BUCKET \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
Lifecycle policies are not recommended for the image bucket. Image tarballs are actively referenced by running crews and must remain available for node rescheduling or cluster scaling events. If you need cleanup, delete tarballs only after the corresponding crew has been undeployed.

Helm Configuration for Bucket Deployment

Set the following environment variables in your Helm values to enable Bucket Deployment:
envVars:
  # Switch the crew deployment provider to bucket mode
  PROVIDER: BUCKET_BUILDKIT_KUBERNETES

  # GCS bucket for storing crew image tarballs (required)
  IMAGE_BUCKET_NAME: "crewai-crew-images"

  # Path prefix inside the bucket (optional, default: "crew-images")
  IMAGE_BUCKET_PREFIX: "crew-images"

  # Image reference name — still required even without a registry.
  # Used as the image name when importing into containerd and
  # referencing in Kubernetes pod specs.
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
CREW_IMAGE_REGISTRY_OVERRIDE is still required even though no registry is used. This value serves as the image name/tag throughout the build, preload, and deploy pipeline. The image is tagged with this name when imported into containerd, and Kubernetes pods reference it in their image: field. Do not remove it.

Configuration Reference

| Environment Variable | Required | Default | Description |
|---|---|---|---|
| PROVIDER | Yes | BUILDKIT_KUBERNETES | Set to BUCKET_BUILDKIT_KUBERNETES to enable bucket mode |
| IMAGE_BUCKET_NAME | Yes | — | GCS bucket name for image tarballs |
| IMAGE_BUCKET_PREFIX | No | crew-images | Path prefix (folder) inside the bucket |
| CREW_IMAGE_REGISTRY_OVERRIDE | Yes | — | Image name reference (used as the containerd image tag) |
| CONTAINERD_SOCKET_PATH | No | /run/containerd/containerd.sock | Path to the containerd socket on nodes |
| CTR_HOST_PATH | No | /usr/bin | Host directory containing the ctr binary |
The CONTAINERD_SOCKET_PATH and CTR_HOST_PATH defaults are correct for standard GKE nodes (Container-Optimized OS and Ubuntu). Override them only if your cluster uses custom node images with non-standard containerd paths.
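If your nodes do use non-standard paths, override both values in your Helm values. For example (the paths below are hypothetical values for a custom node image):

```yaml
envVars:
  CONTAINERD_SOCKET_PATH: "/var/run/containerd/containerd.sock"  # hypothetical custom socket path
  CTR_HOST_PATH: "/usr/local/bin"                                # hypothetical directory containing ctr
```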

What Changes from Artifact Registry Mode

When switching from Artifact Registry to Bucket Deployment, the following Helm values change:
| Value | Artifact Registry | Bucket Deployment |
|---|---|---|
| envVars.PROVIDER | BUILDKIT_KUBERNETES (default) | BUCKET_BUILDKIT_KUBERNETES |
| envVars.IMAGE_BUCKET_NAME | Not set | Your GCS bucket name |
| envVars.IMAGE_BUCKET_PREFIX | Not set | crew-images (or custom) |
| envVars.CREW_IMAGE_REGISTRY_OVERRIDE | GAR path (images pushed here) | Same value (used as image name only) |
| Artifact Registry IAM roles | Required | Not needed |
| Node pool artifactregistry.reader | Required | Not needed |
| Crews namespace Workload Identity | Required (build pods push to GAR) | Required (build pods upload to GCS) |

Verifying Bucket Deployment

After deploying a crew, verify the bucket deployment pipeline:

1. Check the image tarball was uploaded:
gcloud storage ls gs://$IMAGE_BUCKET/crew-images/
# Should list .tar.gz files for deployed crews
2. Check preloader DaemonSet (during deployment):
# The preloader is ephemeral — only visible during active deployments
kubectl get daemonsets -n $GKE_NAMESPACE -l app=image-preloader
3. Verify the image is loaded on nodes:
# SSH into a node and check containerd
gcloud compute ssh NODE_NAME --zone=ZONE -- \
  sudo ctr -n k8s.io images ls | grep crewai
4. Confirm crew pods are running with imagePullPolicy: Never:
kubectl get pods -n crewai-crews -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].imagePullPolicy}{"\n"}{end}'
# Should show "Never" for crew pods deployed via bucket mode

Secret Manager Integration (Optional)

GCP Secret Manager provides centralized secret management for CrewAI Platform.

Which Secrets to Store

Store in Secret Manager (sensitive, need rotation):
  • DB_PASSWORD - Database credentials (if not using IAM auth)
  • SECRET_KEY_BASE - Rails secret key
  • GITHUB_TOKEN - For private repository access
  • Auth provider secrets (ENTRA_ID_CLIENT_SECRET, OKTA_CLIENT_SECRET, etc.)
Keep in values.yaml (configuration, not secrets):
  • DB_HOST, DB_PORT, DB_USER, POSTGRES_DB
  • GCS_PROJECT_ID, GCS_BUCKET
  • APPLICATION_HOST, AUTH_PROVIDER

External Secrets Operator Setup

CrewAI uses External Secrets Operator (ESO) to sync secrets from Secret Manager to Kubernetes. Install ESO (if not already installed):
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets \
  external-secrets/external-secrets \
  --namespace external-secrets-operator \
  --create-namespace

Helm Configuration for Secret Manager

secretStore:
  enabled: true
  provider: "gcp"
  gcp:
    projectID: "your-gcp-project"
    auth:
      workloadIdentity:
        enabled: true
        clusterLocation: "us-central1"
        clusterName: "your-cluster-name"
        serviceAccount: "crewai-sa"

externalSecret:
  enabled: true
  secretStoreKind: SecretStore
  secretStore: "crewai-secret-store"
  secretPath: "crewai-platform"           # Secret name in Secret Manager
  databaseSecretPath: "crewai-db-password" # Separate secret for DB password

External Access (Gateway API or Ingress)

GKE provides built-in support for the Kubernetes Gateway API, which is the recommended way to expose services externally.
The NGINX Ingress Controller was retired in March 2026. For new GKE deployments, Gateway API is recommended over traditional Ingress resources. Existing Ingress configurations continue to work.
GKE ships with built-in GatewayClass resources — no additional controller installation is needed, but Gateway API support must be enabled on the cluster.

Enable Gateway API on GKE

gcloud container clusters update YOUR_CLUSTER_NAME \
  --gateway-api=standard \
  --region=$GCP_REGION \
  --project=$GCP_PROJECT_ID
Verify the GatewayClass resources are available (this may take a minute to propagate):
kubectl get gatewayclass
You should see the following classes:
| GatewayClass | Type | Use Case |
| --- | --- | --- |
| gke-l7-global-external-managed | Global external ALB | Multi-region, CDN, global anycast IP |
| gke-l7-regional-external-managed | Regional external ALB | Single-region, lower latency |
| gke-l7-rilb | Regional internal ALB | Internal-only access within VPC |
If kubectl get gatewayclass returns “No resources found”, the Gateway API CRDs are not installed. The Helm chart will fail with no matches for kind "Gateway". Run the gcloud container clusters update command above and wait for it to complete before deploying.

Helm Configuration

# Gateway API configuration
gateway:
  enabled: true
  create: true
  gatewayClassName: gke-l7-global-external-managed
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: crewai-tls
    - name: http
      protocol: HTTP
      port: 80

web:
  gateway:
    enabled: true
    hostnames:
      - "crewai.your-company.com"

# If OAuth is enabled (shared hostname with /oauthsvc prefix)
oauth:
  enabled: true
  gateway:
    enabled: true
    pathPrefix: "/oauthsvc"
If OAuth requires a dedicated hostname (e.g., because GKE does not support NGINX-style regex path rewriting), set pathPrefix: "/" and specify the hostname:
oauth:
  enabled: true
  gateway:
    enabled: true
    hostname: "oauth.your-company.com"
    pathPrefix: "/"
When using a dedicated OAuth hostname, add it to the Gateway TLS certificate (or create a separate certificate map entry) and update DNS to point to the same Gateway IP.
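With a dedicated hostname, the resulting routing is roughly the following HTTPRoute. This is a sketch using the Gateway API v1 schema; the resource and backend names are hypothetical, and the chart's actual output may differ:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: crewai-oauth            # hypothetical name
spec:
  parentRefs:
    - name: crewai-gateway      # the Gateway created by the chart
  hostnames:
    - oauth.your-company.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: crewai-oauth    # OAuth service
          port: 80
```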
To use a GCP-managed certificate instead of a Kubernetes TLS secret, add the annotation:
gateway:
  enabled: true
  create: true
  gatewayClassName: gke-l7-global-external-managed
  annotations:
    networking.gke.io/certmap: "crewai-cert-map"
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
To create the certificate map:
# Create a managed certificate
gcloud certificate-manager certificates create crewai-cert \
  --domains="crewai.your-company.com"

# Create a certificate map and entry
gcloud certificate-manager maps create crewai-cert-map
gcloud certificate-manager maps entries create crewai-cert-entry \
  --map=crewai-cert-map \
  --certificates=crewai-cert \
  --hostname="crewai.your-company.com"
If you already have a Gateway resource (e.g., shared across multiple apps), reference it instead of creating one:
gateway:
  enabled: true
  create: false
  name: "shared-gateway"
  namespace: "gateway-infra"

web:
  gateway:
    enabled: true
After deploying, get the load balancer IP:
kubectl get gateway -n $GKE_NAMESPACE
# Note the ADDRESS column — update your DNS record to point to it
The Helm chart automatically creates a GKE HealthCheckPolicy that configures the load balancer to use /health as the health check path. Without this, GKE’s health probes use the pod IP as the Host header, which Rails’ HostAuthorization middleware blocks — causing unconditional drop overload errors. This is handled automatically; no manual configuration is needed.
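The generated policy looks approximately like this (GKE's networking.gke.io/v1 HealthCheckPolicy; shown for reference only, since the chart creates it for you, and the metadata and Service names here are hypothetical):

```yaml
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: crewai-web-healthcheck   # hypothetical name
spec:
  default:
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health     # matches the path described above
  targetRef:
    group: ""
    kind: Service
    name: crewai-web             # the web Service
```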

Option 2: GCE Ingress (Native GKE)

If you prefer traditional Ingress resources:
web:
  ingress:
    enabled: true
    className: gce
    host: "crewai.your-company.com"
    annotations:
      kubernetes.io/ingress.global-static-ip-name: "crewai-ip"
      networking.gke.io/managed-certificates: "crewai-cert"

Option 3: NGINX Ingress Controller (Deprecated)

The NGINX Ingress Controller was retired in March 2026. Consider migrating to Gateway API for new deployments.
web:
  ingress:
    enabled: true
    className: nginx
    host: "crewai.your-company.com"
    nginx:
      tls:
        enabled: true
        secretName: "crewai-tls"

Complete GCP Deployment Example

Here is a complete production configuration for GCP:
# values-gcp-production.yaml

# ServiceAccount with Workload Identity annotation
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: crewai-platform@your-project.iam.gserviceaccount.com

# Namespace for crew workloads (default: crewai-crews)
crewNamespace: "crewai-crews"

# Image pull credentials — required for pulling platform images from images.crewai.com.
# When installing via direct Helm (not KOTS), Replicated proxy credentials must be
# provided here. The credentials are the same used for `helm registry login`.
image:
  registries:
    - host: "images.crewai.com"
      username: "your-email@company.com"
      password: "your-replicated-license-token"

# Disable internal services (use GCP managed services)
postgres:
  enabled: false

minio:
  enabled: false

# Cloud SQL Auth Proxy
cloudSqlProxy:
  enabled: true
  instanceConnectionName: "your-project:us-central1:crewai-db"
  port: 5432
  privateIp: true
  autoIamAuthn: false  # Set true if using IAM database authentication

envVars:
  # Database (via Cloud SQL Auth Proxy)
  DB_HOST: "127.0.0.1"
  DB_PORT: "5432"
  DB_USER: "crewai"
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"

  # GCS for object storage
  STORAGE_SERVICE: "google"
  GCS_PROJECT_ID: "your-project"
  GCS_BUCKET: "crewai-prod-storage"
  GCS_IAM_SIGNING: "true"

  # Artifact Registry for crew images
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"

  # Application
  APPLICATION_HOST: "crewai.your-company.com"
  AUTH_PROVIDER: "entra_id"
  RAILS_ENV: "production"
  RAILS_LOG_LEVEL: "info"

# DB_PASSWORD is not needed when using Cloud SQL IAM auth (autoIamAuthn: true).
# For password-based auth, set it in secrets:
# secrets:
#   DB_PASSWORD: "your-secure-password"

# Gateway API (recommended over NGINX Ingress)
gateway:
  enabled: true
  create: true
  gatewayClassName: gke-l7-global-external-managed
  annotations:
    networking.gke.io/certmap: "crewai-cert-map"
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
    - name: http
      protocol: HTTP
      port: 80

# Web
web:
  replicaCount: 3
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"
  gateway:
    enabled: true
    hostnames:
      - "crewai.your-company.com"

# Worker
worker:
  replicaCount: 3
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"

# OAuth (if using built-in integrations)
oauth:
  enabled: true
  gateway:
    enabled: true
    pathPrefix: "/oauthsvc"

# BuildKit
buildkit:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: "500m"
      memory: "2Gi"
    limits:
      cpu: "4"
      memory: "8Gi"

# RBAC
rbac:
  create: true
The image.registries section is required when installing via direct Helm (helm install ... oci://registry.crewai.com/...). It provides credentials for pulling platform images (busybox, redis, buildkit, etc.) from the Replicated proxy at images.crewai.com. Use the same email and license token you used for helm registry login registry.crewai.com. When installing via Replicated KOTS, these credentials are injected automatically and image.registries is not needed.
Deploy:
helm install crewai-platform \
  oci://registry.crewai.com/crewai/stable/crewai-platform \
  --values values-gcp-production.yaml \
  --namespace crewai \
  --create-namespace
Post-install: After the first install, annotate the crews namespace ServiceAccount for Workload Identity (see Step 3):
kubectl annotate serviceaccount default -n crewai-crews --overwrite \
  iam.gke.io/gcp-service-account=crewai-platform@your-project.iam.gserviceaccount.com

Bucket Deployment Variant

To use Bucket Deployment instead of Artifact Registry, modify the envVars section in the example above:
envVars:
  # Database (via Cloud SQL Auth Proxy)
  DB_HOST: "127.0.0.1"
  DB_PORT: "5432"
  DB_USER: "crewai-platform@your-project.iam"
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"

  # GCS for object storage
  STORAGE_SERVICE: "google"
  GCS_PROJECT_ID: "your-project"
  GCS_BUCKET: "crewai-prod-storage"
  GCS_IAM_SIGNING: "true"

  # Bucket Deployment (registry-less crew image delivery)
  PROVIDER: BUCKET_BUILDKIT_KUBERNETES
  IMAGE_BUCKET_NAME: "crewai-crew-images"
  IMAGE_BUCKET_PREFIX: "crew-images"

  # Image reference name (still required — used as containerd image tag)
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"

  # Application
  APPLICATION_HOST: "crewai.your-company.com"
  AUTH_PROVIDER: "entra_id"
  RAILS_ENV: "production"
  RAILS_LOG_LEVEL: "info"
When using Bucket Deployment, Artifact Registry IAM roles (roles/artifactregistry.writer, roles/artifactregistry.reader) are not needed. The roles/storage.objectAdmin role on the image bucket handles both upload (from build pods) and download (from preloader pods). The Crews namespace Workload Identity binding is still required — build pods use it to obtain GCS access tokens for uploading image tarballs.
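As a sketch, the grant on the image bucket looks like this, using the hypothetical bucket and GSA names from the example values above. The gcloud call is shown commented since it requires authenticated credentials:

```shell
# Hypothetical names matching the example values above.
BUCKET="gs://crewai-crew-images"                             # IMAGE_BUCKET_NAME
GSA="crewai-platform@your-project.iam.gserviceaccount.com"
MEMBER="serviceAccount:${GSA}"
echo "grant roles/storage.objectAdmin to ${MEMBER} on ${BUCKET}"

# Requires gcloud credentials:
# gcloud storage buckets add-iam-policy-binding "$BUCKET" \
#   --member="$MEMBER" --role="roles/storage.objectAdmin"
```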

Troubleshooting GCP-Specific Issues

Workload Identity Not Working

Symptoms: Pods get 403 Forbidden or could not retrieve default credentials errors. Verify Workload Identity is enabled on the cluster:
gcloud container clusters describe YOUR_CLUSTER \
  --region=$GCP_REGION \
  --format='value(workloadIdentityConfig.workloadPool)'
# Should output: GCP_PROJECT_ID.svc.id.goog
Check ServiceAccount annotation:
kubectl get serviceaccount $KSA_NAME -n $GKE_NAMESPACE -o yaml | grep gcp-service-account
# Should show: iam.gke.io/gcp-service-account: GSA@PROJECT.iam.gserviceaccount.com
Test from a pod:
kubectl run -it --rm wi-test \
  --namespace=$GKE_NAMESPACE \
  --image=google/cloud-sdk:slim \
  --serviceaccount=$KSA_NAME \
  --restart=Never -- \
  gcloud auth print-access-token
If this fails, verify the IAM binding:
gcloud iam service-accounts get-iam-policy \
  ${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com
# Should list the workloadIdentityUser binding for your KSA
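If the binding is absent, add it with gcloud iam service-accounts add-iam-policy-binding and the roles/iam.workloadIdentityUser role. The member string must reference your namespace and KSA in exactly this format; a local sketch using the placeholder values from this guide:

```shell
GCP_PROJECT_ID="your-gcp-project"
GKE_NAMESPACE="crewai"     # platform namespace
KSA_NAME="crewai-sa"       # Kubernetes ServiceAccount name

# Workload Identity member format: PROJECT.svc.id.goog[NAMESPACE/KSA]
MEMBER="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"
echo "$MEMBER"
```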

GCS Signed URL Errors

Symptoms: Logs show Google::Cloud::Storage::SignedUrlUnavailable: Service account credentials 'issuer (client_email)' is missing when deploying crews. This happens when Workload Identity is used but IAM-based URL signing is not enabled. The google-cloud-storage gem cannot sign URLs without a private key; with Workload Identity, the IAM signBlob API must be used instead. Fix:
  1. Ensure GCS_IAM_SIGNING: "true" is set in envVars
  2. Grant the roles/iam.serviceAccountTokenCreator role to the GSA:
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountTokenCreator"
  3. Redeploy the Helm chart

GCS Access Denied

Symptoms: Logs show Google::Cloud::PermissionDeniedError for storage operations.
# Verify the GSA has storage permissions
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --format="table(bindings.role)"

# Test from pod
kubectl exec -it deploy/crewai-web -n $GKE_NAMESPACE -- \
  ruby -e "require 'google/cloud/storage'; puts Google::Cloud::Storage.new.buckets.map(&:name)"

Cloud SQL Connection Failures

Symptoms: Pods show connection refused on 127.0.0.1:5432. Check the Cloud SQL Proxy sidecar is running:
kubectl get pods -n $GKE_NAMESPACE -l app.kubernetes.io/component=web -o jsonpath='{.items[0].status.containerStatuses[*].name}'
# Should include: cloud-sql-proxy

kubectl logs deploy/crewai-web -n $GKE_NAMESPACE -c cloud-sql-proxy
Verify the instance connection name:
gcloud sql instances describe $SQL_INSTANCE --format='value(connectionName)'
# Should match cloudSqlProxy.instanceConnectionName in your values

Cloud SQL IAM Authentication Failures

Symptoms: fe_sendauth: no password supplied or issue connecting with your username/password.
“no password supplied” means autoIamAuthn is not enabled in your Helm values:
cloudSqlProxy:
  autoIamAuthn: true  # Must be true for IAM auth
“issue connecting with your username/password” — the IAM token is being injected but rejected. Check these in order:
  1. IAM authentication flag on the instance (most common miss — off by default):
gcloud sql instances describe $SQL_INSTANCE \
  --format='value(settings.databaseFlags)'
# Must include: cloudsql.iam_authentication=on

# If missing, enable it:
gcloud sql instances patch $SQL_INSTANCE \
  --database-flags=cloudsql.iam_authentication=on
  2. IAM database user exists:
gcloud sql users list --instance=$SQL_INSTANCE --format="table(name,type)"
# Must show: GSA_NAME@GCP_PROJECT_ID.iam   CLOUD_IAM_SERVICE_ACCOUNT

# If missing:
gcloud sql users create ${GSA_NAME}@${GCP_PROJECT_ID}.iam \
  --instance=$SQL_INSTANCE \
  --type=CLOUD_IAM_SERVICE_ACCOUNT
  3. GSA has required IAM roles:
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --format="table(bindings.role)"
# Must include: roles/cloudsql.client AND roles/cloudsql.instanceUser
  4. DB_USER is in IAM format (in Helm values):
envVars:
  DB_USER: "crewai-platform@your-project.iam"  # NOT "crewai"
  5. SQL privileges granted — the IAM user must have been granted access to the databases via GRANT ALL PRIVILEGES (see “Create Databases and User” section above)
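On point 4, the IAM username is simply the GSA email with the .gserviceaccount.com suffix removed; a quick local sanity check:

```shell
GSA_EMAIL="crewai-platform@your-project.iam.gserviceaccount.com"
# Cloud SQL IAM usernames drop the ".gserviceaccount.com" suffix:
DB_USER="${GSA_EMAIL%.gserviceaccount.com}"
echo "$DB_USER"   # crewai-platform@your-project.iam
```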

Cloud SQL IAM: Table Ownership Errors During Upgrades

Symptoms: Rails migrations fail with PG::InsufficientPrivilege: ERROR: must be owner of table <table_name>. Root cause: PostgreSQL DDL operations (ALTER TABLE, DROP COLUMN, ADD INDEX, etc.) require the executing user to be the owner of the table. When using IAM authentication, the IAM database user typically does not own tables that were created by the postgres superuser or a previous password-based user. GRANT ALL PRIVILEGES grants read/write access but does not transfer ownership. Fix: Connect as the postgres superuser and transfer ownership of all tables, sequences, and views to the IAM user. See Step 5 of the IAM authentication setup for the exact SQL commands. Prevention: Run the ownership transfer SQL during initial IAM auth setup (Step 5), before the first Helm upgrade.

Artifact Registry Push Failures

Symptoms: Crew deployments fail with unauthorized or denied during image push. Verify the GSA has Artifact Registry write access:
gcloud artifacts repositories get-iam-policy $AR_REPO \
  --location=$GCP_REGION \
  --format="table(bindings.role, bindings.members)"
Check BuildKit pod logs:
# Find the most recent buildkit build pod
kubectl get pods -n $GKE_NAMESPACE -l app=buildkit-build --sort-by=.metadata.creationTimestamp

# Check logs for GAR auth
kubectl logs POD_NAME -n $GKE_NAMESPACE -c buildkit-client | grep "GAR authentication"
Check Workload Identity annotation on the crews namespace default SA (build pods use this SA):
kubectl get serviceaccount default -n crewai-crews -o jsonpath='{.metadata.annotations}'
# Should include: iam.gke.io/gcp-service-account: GSA@PROJECT.iam.gserviceaccount.com
If missing, annotate it (see Step 3).

Artifact Registry Pull Failures (Crew Pods)

Symptoms: Crew pods in crewai-crews namespace show ImagePullBackOff or ErrImagePull with 403 Forbidden when pulling from *-docker.pkg.dev. This can have multiple causes:
  1. Wrong pull secret name: The crew pod references a pull secret that doesn’t exist or has empty credentials.
# Check which secret the crew pod references
kubectl get pod POD_NAME -n crewai-crews -o jsonpath='{.spec.imagePullSecrets}'

# Verify the secret exists and has data
kubectl get secret SECRET_NAME -n crewai-crews -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
If the decoded config shows {"auths":{}} (empty), the Helm chart’s image.registries may not be configured. See the Complete Example.
  2. Node pool compute SA lacks GAR read access:
# Verify node pool OAuth scopes (should include cloud-platform)
gcloud container node-pools describe default-pool \
  --cluster YOUR_CLUSTER \
  --region $GCP_REGION \
  --format='value(config.oauthScopes)'

# Check if the compute SA has artifactregistry.reader
GCE_SA="$(gcloud projects describe $GCP_PROJECT_ID --format='value(projectNumber)')-compute@developer.gserviceaccount.com"
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:${GCE_SA} AND bindings.role:roles/artifactregistry" \
  --format="table(bindings.role)"
If the role is missing, grant it (see Step 2).
  3. Verify the image exists in GAR:
gcloud artifacts docker images list \
  ${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/crewai/crewai-enterprise \
  --include-tags

Build Pod Image Pull Failures

Symptoms: Build pods in the platform namespace show ErrImagePull for images from images.crewai.com. This means the registry pull secret lacks Replicated proxy credentials. Verify:
# Check the pull secret contents
kubectl get secret $(kubectl get pod POD_NAME -n $GKE_NAMESPACE -o jsonpath='{.spec.imagePullSecrets[0].name}') \
  -n $GKE_NAMESPACE -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | python3 -m json.tool
If images.crewai.com is not in the auths section, add Replicated proxy credentials to image.registries in your Helm values and redeploy (see the Complete Example).
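For comparison, a non-empty pull secret decodes to something like the config built here with placeholder credentials (the auth field is the base64 of user:password):

```shell
python3 - <<'EOF'
import base64, json
# Placeholder credentials — use your Replicated email and license token.
user, pw = "your-email@company.com", "your-replicated-license-token"
auth = base64.b64encode(f"{user}:{pw}".encode()).decode()
cfg = {"auths": {"images.crewai.com": {"username": user, "password": pw, "auth": auth}}}
print(json.dumps(cfg, indent=2))
EOF
```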

Secret Manager Access Denied

Symptoms: ExternalSecret shows SecretSyncedError.
kubectl get externalsecret -n $GKE_NAMESPACE
kubectl describe externalsecret crewai-external-secret -n $GKE_NAMESPACE

kubectl get secretstore -n $GKE_NAMESPACE
kubectl describe secretstore crewai-secret-store -n $GKE_NAMESPACE

Bucket Deployment: Upload Failed

Symptoms: Crew deployment fails during the build phase with Bucket upload failed in the build pod logs. Check the build pod logs:
kubectl get pods -n $GKE_NAMESPACE -l app=buildkit-build --sort-by=.metadata.creationTimestamp
kubectl logs POD_NAME -n $GKE_NAMESPACE -c buildkit-client | grep -A5 "bucket"
Common causes:
  1. Missing IMAGE_BUCKET_NAME — The environment variable is not set. Verify it appears in your Helm values under envVars.
  2. Bucket does not exist — Verify the bucket exists:
gcloud storage buckets describe gs://$IMAGE_BUCKET
  3. Missing IAM permissions — The Workload Identity GSA needs roles/storage.objectAdmin on the image bucket:
gcloud storage buckets get-iam-policy gs://$IMAGE_BUCKET \
  --format="table(bindings.role, bindings.members)" \
  | grep $GSA_NAME
  4. Metadata server unreachable — Build pods obtain access tokens from the GCE metadata server. If the logs show Failed to get access token from metadata server, verify that Workload Identity is bound for the crews namespace (see Step 3).

Bucket Deployment: Preloader Timeout

Symptoms: Crew deployment hangs at the preload phase. The preloader DaemonSet pods are not reaching Ready state. Check the preloader DaemonSet and pod status:
kubectl get daemonsets -n $GKE_NAMESPACE -l app=image-preloader
kubectl get pods -n $GKE_NAMESPACE -l app=image-preloader -o wide
kubectl logs -n $GKE_NAMESPACE -l app=image-preloader --tail=50
Common causes:
  1. Image tarball not found in bucket — The preloader downloads from the bucket. If the build phase failed silently or the object name doesn’t match:
gcloud storage ls gs://$IMAGE_BUCKET/crew-images/
  2. GCS download permission denied — The preloader pods run in the platform namespace and use the platform KSA’s Workload Identity. Verify the GSA has read access to the image bucket.
  3. containerd socket not accessible — The preloader mounts the host’s containerd socket. If the node uses a non-standard socket path, set CONTAINERD_SOCKET_PATH in your Helm values. Check the default path:
# SSH into a node
gcloud compute ssh NODE_NAME --zone=ZONE -- ls -la /run/containerd/containerd.sock
  4. ctr binary not found — The preloader uses the host’s ctr binary from /usr/bin. If it’s in a different location on your nodes, set CTR_HOST_PATH in your Helm values:
gcloud compute ssh NODE_NAME --zone=ZONE -- which ctr

Bucket Deployment: ErrImageNeverPull

Symptoms: Crew pods show ErrImageNeverPull status after deployment. This means the crew pod uses imagePullPolicy: Never but the image is not present in the node’s containerd store. This can happen if:
  1. Preloader didn’t complete on this node — The node may have been added to the cluster after the preloader DaemonSet ran. Redeploy the crew to trigger a fresh preload cycle.
  2. containerd image was garbage collected — containerd may have cleaned up unused images. Redeploy the crew.
  3. Image name mismatch — The CREW_IMAGE_REGISTRY_OVERRIDE value must be consistent across build, preload, and deploy. Verify:
# Check what image the pod is trying to use
kubectl get pod POD_NAME -n crewai-crews -o jsonpath='{.spec.containers[0].image}'

# Check what images are loaded on the node
gcloud compute ssh NODE_NAME --zone=ZONE -- \
  sudo ctr -n k8s.io images ls | grep crewai
The image name from the pod spec must exactly match an image in the ctr output.
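A local sketch of that comparison, using hypothetical image references; note the match must cover registry, repository, and tag:

```shell
# Hypothetical image references; the tag is part of the match.
POD_IMAGE="us-central1-docker.pkg.dev/your-project/crewai/crew-1234:v3"
NODE_IMAGES="us-central1-docker.pkg.dev/your-project/crewai/crew-1234:v3
docker.io/library/redis:7"

# -x: whole-line match, -F: literal string (no regex)
if printf '%s\n' "$NODE_IMAGES" | grep -qxF "$POD_IMAGE"; then
  echo "image present on node"
else
  echo "image missing: ErrImageNeverPull expected"
fi
```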