
Overview

This guide covers GCP-specific integration for CrewAI Platform deployments on Google Kubernetes Engine (GKE). It focuses on configuring GCS for object storage, Artifact Registry (or registry-less Bucket Deployment) for crew image builds, Cloud SQL for PostgreSQL, and optionally Secret Manager — all authenticated via GKE Workload Identity Federation (no static keys).
This guide assumes you have:
  • A GKE cluster running Kubernetes 1.28+ with Workload Identity enabled
  • gcloud CLI and kubectl configured
  • Helm 3.10+ installed
  • Basic familiarity with GCP services (Cloud SQL, GCS, Artifact Registry)

Prerequisites

Before configuring CrewAI Platform, ensure the following GCP components are in place.

CrewAI Platform supports AMD64 (x86_64) Kubernetes worker nodes. ARM64 (aarch64) worker nodes are not currently supported. For full platform requirements, see the Requirements Guide.

Required GCP Infrastructure

| Component | Documentation Link | Notes |
|---|---|---|
| GKE Cluster | GKE Quickstart | 1.28+ with Workload Identity enabled |
| Workload Identity | Workload Identity Federation | Must be enabled at cluster and node pool level |
| VPC and Subnets | GKE Network Overview | Private cluster recommended |

Required GCP APIs

Enable the following APIs in your project before proceeding:
gcloud services enable \
  container.googleapis.com \
  sqladmin.googleapis.com \
  storage-api.googleapis.com \
  artifactregistry.googleapis.com \
  secretmanager.googleapis.com \
  certificatemanager.googleapis.com \
  iam.googleapis.com
Do not proceed with CrewAI installation until these prerequisites are met. The Helm chart will fail to deploy without them.

Step 1: Create the GCP Service Account

All CrewAI workloads (web, worker, buildkit) share a single GCP Service Account (GSA) that is mapped to the Kubernetes ServiceAccount via Workload Identity. This GSA needs permissions for GCS, Artifact Registry, Cloud SQL, and optionally Secret Manager.
export GCP_PROJECT_ID="your-gcp-project"
export GCP_REGION="us-central1"
export GSA_NAME="crewai-platform"
export GKE_NAMESPACE="crewai"           # Kubernetes namespace for the Helm release
export KSA_NAME="crewai-sa"             # Kubernetes ServiceAccount (chart default)

# Create the GCP Service Account
gcloud iam service-accounts create $GSA_NAME \
  --display-name="CrewAI Platform" \
  --project=$GCP_PROJECT_ID

Step 2: Grant IAM Roles

Bind the required IAM roles to the service account. Each role maps to a specific CrewAI requirement:
| Role | Assigned To | Purpose |
|---|---|---|
| roles/storage.objectAdmin | GSA | Read/write crew artifacts and uploads in GCS. Also covers image bucket read/write when using Bucket Deployment |
| roles/iam.serviceAccountTokenCreator | GSA | Sign GCS URLs via IAM signBlob (required for Workload Identity) |
| roles/artifactregistry.writer | GSA | Push and pull crew container images (build pods via Workload Identity). Not needed if using Bucket Deployment |
| roles/cloudsql.client | GSA | Connect to Cloud SQL via the Auth Proxy |
| roles/cloudsql.instanceUser | GSA | IAM-based database authentication (only if using autoIamAuthn: true) |
| roles/artifactregistry.reader | Node pool GCE SA | Pull crew images from GAR (kubelet uses node-level credentials). Not needed if using Bucket Deployment |
| roles/secretmanager.secretAccessor | GSA | Read secrets from Secret Manager (optional) |
# GCS - object storage for crew artifacts and uploads
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# GCS - signed URL generation via IAM signBlob (required for Workload Identity)
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountTokenCreator"

# Artifact Registry - push/pull crew container images
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.writer"

# Cloud SQL - database connectivity via Cloud SQL Auth Proxy
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"

# Cloud SQL - IAM database authentication (only if using autoIamAuthn: true)
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/cloudsql.instanceUser"

# (Optional) Secret Manager - external secrets
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"

# Node pool GCE SA - pull crew images from Artifact Registry
# The kubelet uses the node's compute SA (not Workload Identity) for image pulls
GCE_SA="$(gcloud projects describe $GCP_PROJECT_ID --format='value(projectNumber)')-compute@developer.gserviceaccount.com"
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GCE_SA}" \
  --role="roles/artifactregistry.reader"
For tighter security, scope roles/storage.objectAdmin and roles/artifactregistry.writer to specific resources using --condition flags or bucket/repo-level IAM instead of project-level bindings.

The node pool GCE SA binding is separate from the GSA bindings. The platform injects a short-lived GAR token at deploy time for immediate pulls, but the node SA provides a reliable fallback when crew pods are rescheduled after the token expires.

Step 3: Bind Workload Identity

Create the IAM policy bindings that allow the Kubernetes ServiceAccounts to impersonate the GCP Service Account. Two bindings are needed: one for the platform namespace (web, worker, buildkit daemon) and one for the crews namespace (build pods that push images to GAR):
# Platform namespace — used by web, worker, and buildkit daemon pods
gcloud iam service-accounts add-iam-policy-binding \
  ${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"

# Crews namespace — used by build pods that push crew images to Artifact Registry
gcloud iam service-accounts add-iam-policy-binding \
  ${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[crewai-crews/default]"
After the initial Helm install (which creates the crewai-crews namespace automatically), annotate the default ServiceAccount in the crews namespace so build pods can authenticate to GAR via Workload Identity:
kubectl annotate serviceaccount default -n crewai-crews --overwrite \
  iam.gke.io/gcp-service-account=${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com
The [NAMESPACE/KSA_NAME] in the member string must exactly match the Kubernetes namespace and the ServiceAccount name. The chart creates crewai-sa by default when rbac.create: true. The crews namespace (crewai-crews by default) uses the default ServiceAccount for build pods.

The kubectl annotate command must run after the first helm install because the Helm chart creates the crewai-crews namespace. If you need to annotate before installing, create the namespace manually first: kubectl create namespace crewai-crews.
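The member string format is easy to get wrong. As a quick sanity check, you can assemble it from the variables set earlier and compare it against the binding you created (the values below are this guide's placeholders):

```shell
# Assemble the Workload Identity member string from the variables set earlier.
# The bracketed portion must match the namespace and KSA name exactly.
GCP_PROJECT_ID="your-gcp-project"
GKE_NAMESPACE="crewai"
KSA_NAME="crewai-sa"
MEMBER="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"
echo "$MEMBER"
```

You can then confirm the binding exists by running gcloud iam service-accounts get-iam-policy against the GSA and looking for this exact member under roles/iam.workloadIdentityUser.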

Cloud SQL for PostgreSQL

CrewAI Platform requires PostgreSQL 16+ for production deployments.

Cloud SQL Instance Sizing

Minimum recommended specifications based on CrewAI workload characteristics:
| Deployment Size | Machine Type | vCPU | RAM | Storage |
|---|---|---|---|---|
| Development | db-custom-2-4096 | 2 | 4 GiB | 50 GiB SSD |
| Small Production | db-custom-2-16384 | 2 | 16 GiB | 100 GiB SSD |
| Medium Production | db-custom-4-32768 | 4 | 32 GiB | 250 GiB SSD |
| Large Production | db-custom-8-65536 | 8 | 64 GiB | 500 GiB SSD |

Create the Cloud SQL Instance

export SQL_INSTANCE="crewai-db"

gcloud sql instances create $SQL_INSTANCE \
  --database-version=POSTGRES_16 \
  --tier=db-custom-2-16384 \
  --region=$GCP_REGION \
  --storage-size=100GB \
  --storage-type=SSD \
  --storage-auto-increase \
  --network=default \
  --no-assign-ip

Create Databases and User

CrewAI requires three databases (primary, cable, and OAuth), plus an optional fourth for Wharf tracing:
# Create databases
gcloud sql databases create crewai_plus_production --instance=$SQL_INSTANCE
gcloud sql databases create crewai_plus_cable_production --instance=$SQL_INSTANCE
gcloud sql databases create crewai_plus_oauth_production --instance=$SQL_INSTANCE

# If using Wharf OTLP trace collector (wharf.enabled: true)
gcloud sql databases create wharf --instance=$SQL_INSTANCE
You can use either password-based or IAM-based authentication.

Option A: Password-based authentication (simpler)
gcloud sql users create crewai \
  --instance=$SQL_INSTANCE \
  --password="YOUR_SECURE_PASSWORD"
Option B: IAM-based authentication (no static passwords)

IAM authentication uses Workload Identity to authenticate the application to Cloud SQL via short-lived tokens. No DB_PASSWORD is needed.

Step 1: Enable IAM authentication on the Cloud SQL instance:
gcloud sql instances patch $SQL_INSTANCE \
  --database-flags=cloudsql.iam_authentication=on
The cloudsql.iam_authentication flag is off by default. Without it, Cloud SQL rejects IAM tokens regardless of user or role configuration.
Step 2: Create the IAM database user:
gcloud sql users create ${GSA_NAME}@${GCP_PROJECT_ID}.iam \
  --instance=$SQL_INSTANCE \
  --type=CLOUD_IAM_SERVICE_ACCOUNT
Step 3: Grant the GSA the Cloud SQL IAM login role:
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/cloudsql.instanceUser"
Step 4: Grant SQL privileges to the IAM user. Connect to the Cloud SQL instance using the postgres superuser (recommended) or an existing admin user. You can connect via gcloud sql connect or from inside a running pod:
# Option 1: Connect via gcloud (from your local machine)
gcloud sql connect $SQL_INSTANCE --user=postgres --database=postgres

# Option 2: Connect from inside a pod with the Cloud SQL Auth Proxy sidecar
kubectl exec -it -n $GKE_NAMESPACE deploy/crewai-web -c $(kubectl get deploy crewai-web -n $GKE_NAMESPACE -o jsonpath='{.spec.template.spec.containers[0].name}') -- \
  bash -c 'PGPASSWORD=YOUR_PASSWORD psql -h 127.0.0.1 -U postgres -d postgres'
Then run the following SQL commands (replace GSA_NAME@GCP_PROJECT_ID.iam with your actual IAM user, e.g., crewai-platform@your-project.iam):
-- Grant connect + full privileges on each database
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_cable_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_oauth_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
-- If using Wharf (wharf.enabled: true)
GRANT ALL PRIVILEGES ON DATABASE wharf TO "GSA_NAME@GCP_PROJECT_ID.iam";

-- Switch to primary database and grant schema/table/sequence access
\c crewai_plus_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";

-- Repeat for cable database
\c crewai_plus_cable_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";

-- Repeat for OAuth database
\c crewai_plus_oauth_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";
Migrating from password-based to IAM auth? If your databases already have tables created by a password-based user (e.g., crewai), you must run the GRANT ALL ON ALL TABLES and GRANT ALL ON ALL SEQUENCES commands above for each database. Without these grants, the IAM user will get permission denied for table schema_migrations errors during migrations. The ALTER DEFAULT PRIVILEGES statements only apply to future objects — they do not retroactively grant access to existing tables.
Step 5: Transfer table ownership to the IAM user.

PostgreSQL DDL operations (e.g., ALTER TABLE, DROP COLUMN) require the executing user to be the owner of the table — GRANT ALL PRIVILEGES is not sufficient. If the databases were initially set up by a different user (e.g., postgres or a password-based crewai user), the IAM user will not own existing tables, and future upgrade migrations that modify table structure will fail. The pre-upgrade migration hook automatically detects this condition and will report the affected tables with the exact SQL to fix them.

To prevent this from blocking your first upgrade, transfer ownership proactively. Connect to the database as the postgres superuser using the same host and port the application uses:
PGPASSWORD=<postgres-password> psql -h "$DB_HOST" -p "${DB_PORT:-5432}" -U postgres -d "$POSTGRES_DB"
Then run the following SQL (replace the IAM user with your actual value, e.g., crewai-platform@your-project.iam):
-- Transfer all tables and sequences in the primary database
\c crewai_plus_production
DO $$ DECLARE r RECORD; BEGIN
  FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
  FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
END $$;

-- Repeat for cable database
\c crewai_plus_cable_production
DO $$ DECLARE r RECORD; BEGIN
  FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
  FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
END $$;

-- Repeat for OAuth database
\c crewai_plus_oauth_production
DO $$ DECLARE r RECORD; BEGIN
  FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
  FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
    EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
  END LOOP;
END $$;
Any tables the IAM user creates going forward (via Rails migrations) are automatically owned by it, so this transfer only needs to be done once for pre-existing tables.
Use the postgres superuser for these commands. Only the current owner or a superuser can transfer table ownership. A non-superuser like crewai can run GRANT commands on objects it owns, but cannot run ALTER TABLE ... OWNER TO for tables it doesn’t own.
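To check which tables still need a transfer, you can list the current owners first with a read-only query (run it in each database):

```sql
-- Show the owner of each table in the public schema; any table not owned by
-- the IAM user will block DDL migrations run as that user.
SELECT tablename, tableowner
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY tablename;
```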
The Helm chart includes a built-in Cloud SQL Auth Proxy sidecar. When enabled, it runs alongside the web, worker, OAuth, Wharf, and job containers, authenticating via Workload Identity. The app connects to 127.0.0.1 through the proxy.
# Helm values
cloudSqlProxy:
  enabled: true
  instanceConnectionName: "your-project:us-central1:crewai-db"  # GCP_PROJECT_ID:GCP_REGION:INSTANCE
  port: 5432
  privateIp: true           # Use private IP (recommended for VPC-peered instances)
  autoIamAuthn: true         # IAM-based authentication (no password needed)
If you prefer password-based auth through the proxy, set autoIamAuthn: false and provide DB_PASSWORD via secrets.

Helm Database Configuration

For IAM authentication (autoIamAuthn: true):
postgres:
  enabled: false  # Disable built-in PostgreSQL

envVars:
  DB_HOST: "127.0.0.1"  # Cloud SQL Auth Proxy
  DB_PORT: "5432"
  DB_USER: "crewai-platform@your-project.iam"  # GSA email minus .gserviceaccount.com
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"
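In the values above, DB_USER is simply the GSA email with the .gserviceaccount.com suffix removed. A quick shell sketch of the derivation, using this guide's placeholder project:

```shell
# Derive the Cloud SQL IAM user name from the GSA email:
# IAM database users drop the ".gserviceaccount.com" suffix.
GSA_EMAIL="crewai-platform@your-project.iam.gserviceaccount.com"
DB_USER="${GSA_EMAIL%.gserviceaccount.com}"
echo "$DB_USER"   # crewai-platform@your-project.iam
```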
No DB_PASSWORD is needed — the Cloud SQL Auth Proxy handles authentication automatically via Workload Identity.

For password-based authentication (autoIamAuthn: false):
postgres:
  enabled: false

envVars:
  DB_HOST: "127.0.0.1"
  DB_PORT: "5432"
  DB_USER: "crewai"
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"

secrets:
  DB_PASSWORD: "YOUR_SECURE_PASSWORD"

Google Cloud Storage for Object Storage

CrewAI Platform uses GCS for storing crew artifacts, tool outputs, and user uploads. With Workload Identity, no static credentials are needed.

Create GCS Bucket

export GCS_BUCKET="crewai-prod-storage"

gsutil mb -p $GCP_PROJECT_ID -l $GCP_REGION -b on gs://$GCS_BUCKET

# Enable versioning for data protection
gsutil versioning set on gs://$GCS_BUCKET

# Set lifecycle rule to clean up old versions (optional)
cat > /tmp/lifecycle.json << 'EOF'
{
  "rule": [{
    "action": {"type": "Delete"},
    "condition": {"numNewerVersions": 3, "isLive": false}
  }]
}
EOF
gsutil lifecycle set /tmp/lifecycle.json gs://$GCS_BUCKET

Helm Configuration for GCS

envVars:
  STORAGE_SERVICE: "google"
  GCS_PROJECT_ID: "your-gcp-project"
  GCS_BUCKET: "crewai-prod-storage"
  GCS_IAM_SIGNING: "true"
  # Optional: explicit GSA email for signed URLs (auto-detected from metadata server if blank)
  # GCS_SIGNING_EMAIL: "crewai-platform@your-gcp-project.iam.gserviceaccount.com"
No credentials or keyfile configuration is required. The google-cloud-storage gem uses Application Default Credentials (ADC), which are automatically provided by GKE Workload Identity.
GCS_IAM_SIGNING is required when using Workload Identity. Without it, GCS signed URL generation will fail because Workload Identity credentials don’t include a private key for local signing. When enabled, the platform uses the IAM signBlob API instead, which requires the roles/iam.serviceAccountTokenCreator role granted in Step 2.

If GCS_SIGNING_EMAIL is left blank, the service account email is automatically detected from the GKE metadata server.

Artifact Registry for Container Images

CrewAI Platform requires Artifact Registry for storing crew automation container images. When users create and deploy crews, CrewAI builds container images and pushes them to your registry.

Repository Requirements

Critical Requirements:
  • The final repository URI ends in /crewai-enterprise; the platform appends this suffix automatically
  • Immutable tags must be disabled (CrewAI overwrites tags for crew versions)

Create Artifact Registry Repository

export AR_REPO="crewai"

gcloud artifacts repositories create $AR_REPO \
  --repository-format=docker \
  --location=$GCP_REGION \
  --description="CrewAI Platform crew images"

# Verify the repository URI
gcloud artifacts repositories describe $AR_REPO \
  --location=$GCP_REGION \
  --format='value(name)'
# Output: projects/GCP_PROJECT_ID/locations/GCP_REGION/repositories/crewai
The resulting registry host is GCP_REGION-docker.pkg.dev. For example: us-central1-docker.pkg.dev/your-project/crewai.

Valid repository URIs (set in CREW_IMAGE_REGISTRY_OVERRIDE):
  • us-central1-docker.pkg.dev/your-project/crewai
  • us-docker.pkg.dev/your-project/crewai
  • europe-west1-docker.pkg.dev/your-project/crewai

Helm Configuration for Artifact Registry

envVars:
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
  # Note: /crewai-enterprise suffix is added automatically by CrewAI Platform
The Helm chart auto-detects the GAR host from CREW_IMAGE_REGISTRY_OVERRIDE when it matches the *-docker.pkg.dev pattern and sets GCP_ARTIFACT_REGISTRY_HOST automatically. BuildKit pods then obtain short-lived access tokens from the GKE metadata server to authenticate pushes.
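The detection can be illustrated with a short shell sketch (an approximation of the chart's behavior, not its actual template code): the host is everything before the first slash, accepted only when it matches the *-docker.pkg.dev pattern.

```shell
CREW_IMAGE_REGISTRY_OVERRIDE="us-central1-docker.pkg.dev/your-project/crewai"

# Take everything before the first "/" and accept it only if it is a GAR host.
REGISTRY_HOST="${CREW_IMAGE_REGISTRY_OVERRIDE%%/*}"
case "$REGISTRY_HOST" in
  *-docker.pkg.dev) GCP_ARTIFACT_REGISTRY_HOST="$REGISTRY_HOST" ;;
  *)                GCP_ARTIFACT_REGISTRY_HOST="" ;;
esac
echo "$GCP_ARTIFACT_REGISTRY_HOST"   # us-central1-docker.pkg.dev
```

Multi-region hosts such as us-docker.pkg.dev match the same pattern, so they are detected as well.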

How Crew Pods Pull Images from GAR

When a crew is deployed, the platform automatically injects a fresh short-lived GAR access token into the crew namespace’s image pull secret. This ensures crew pods can immediately pull their built images from Artifact Registry without any manual credential management.

How it works:
  1. Build pods push images to GAR via Workload Identity (using the GSA’s roles/artifactregistry.writer)
  2. At deploy time, the platform fetches a fresh GAR access token from the GKE metadata server and merges it into the registry pull secret
  3. Crew pods use this enriched pull secret to pull their images from GAR
This mechanism is fully automatic — it activates whenever the Helm chart auto-detects a GAR registry from CREW_IMAGE_REGISTRY_OVERRIDE.
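The enriched pull secret is a standard dockerconfigjson whose auth entry is keyed by the GAR host, using oauth2accesstoken as the username and the short-lived token as the password. A minimal sketch of the payload shape (the token value below is a placeholder, not a real credential):

```shell
GAR_HOST="us-central1-docker.pkg.dev"
TOKEN="ya29.placeholder-token"   # placeholder; real tokens come from the GKE metadata server

# dockerconfigjson auth entries are base64("username:password")
AUTH=$(printf 'oauth2accesstoken:%s' "$TOKEN" | base64 | tr -d '\n')
printf '{"auths":{"%s":{"auth":"%s"}}}\n' "$GAR_HOST" "$AUTH"
```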

Enable GKE Node Image Pulls from GAR (Defense-in-Depth)

As a fallback for scenarios where a crew pod is rescheduled to a new node after the injected token has expired, the node pool’s compute service account must have roles/artifactregistry.reader. This is configured in Step 2: Grant IAM Roles.
This IAM binding is required for production reliability. Without it, crew pods may fail to pull images if they are rescheduled to a new node more than one hour after initial deployment.

Verifying Artifact Registry Access

Test that the Workload Identity binding is working:
# From a pod in the namespace
kubectl run -it --rm gcloud-test \
  --namespace=$GKE_NAMESPACE \
  --image=google/cloud-sdk:slim \
  --overrides="{\"spec\":{\"serviceAccountName\":\"${KSA_NAME}\"}}" \
  --restart=Never -- bash

# Inside the pod:
gcloud auth print-access-token  # Should succeed via Workload Identity
gcloud artifacts docker images list ${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/crewai

Bucket Deployment — Registry-Less Image Delivery

Bucket Deployment is an alternative to Artifact Registry that eliminates the need for any container registry. Instead of pushing built crew images to a registry, the platform stores them as compressed OCI tarballs in a GCS bucket and loads them directly into the container runtime on each Kubernetes node before deployment.
Bucket Deployment and Artifact Registry are mutually exclusive for crew image delivery. Choose one approach based on your requirements. Both still use BuildKit to build images — the difference is where the built image is stored and how it reaches the nodes.

When to Use Bucket Deployment

| Scenario | Recommended Approach |
|---|---|
| Standard GKE deployment | Artifact Registry |
| Customer policy prohibits container registries | Bucket Deployment |
| Air-gapped or restricted network (no registry access) | Bucket Deployment |
| Existing GCS infrastructure, want to avoid GAR setup | Bucket Deployment |
| Multi-region with global GAR replication | Artifact Registry |

How Bucket Deployment Works

The workflow has four phases:
  1. Build — When a crew is deployed, BuildKit builds the container image as usual, but instead of pushing it to a registry, it outputs an OCI tarball. The tarball is compressed with gzip and uploaded to a GCS bucket using a Workload Identity access token.
  2. Preload — A temporary Kubernetes DaemonSet is created, placing one pod on every node in the cluster. Each pod downloads the tarball from GCS, decompresses it, and imports it into the node’s containerd runtime using ctr images import. This makes the image available to the kubelet as if it had been pulled from a registry.
  3. Deploy — Once all nodes have the image loaded, the crew is deployed via Helm with imagePullPolicy: Never and no imagePullSecrets. The kubelet finds the image in its local store and starts the pod normally.
  4. Cleanup — The preloader DaemonSet is deleted. The imported images remain in the node’s containerd store.
GCS Bucket                    Kubernetes Cluster
┌─────────────────────┐       ┌────────────────────────────────────┐
│ crew-images/        │       │                                    │
│   <ref>.tar.gz ─────┼─wget──┼──→ Node 1: ctr import → ✓ cached  │
│                     │       │                                    │
│   <ref>.tar.gz ─────┼─wget──┼──→ Node 2: ctr import → ✓ cached  │
│                     │       │                                    │
│   <ref>.tar.gz ─────┼─wget──┼──→ Node 3: ctr import → ✓ cached  │
│                     │       │                                    │
└─────────────────────┘       │  All nodes ready → Helm deploy     │
                              │  (imagePullPolicy: Never)          │
                              │  → Preloader DaemonSet deleted     │
                              └────────────────────────────────────┘
The preloader DaemonSet runs with privileged: true to access the node’s containerd socket and ctr binary. This elevated access is ephemeral — it only exists during the deployment process and is cleaned up immediately after the image is loaded. The crew pods themselves run with normal, unprivileged security contexts.
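For reference, the preloader has roughly the following shape. This is an illustrative sketch only, not the chart's actual manifest — the name, labels, image, and bucket path are assumptions:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: image-preloader            # illustrative name
  namespace: crewai-crews
spec:
  selector:
    matchLabels:
      app: image-preloader
  template:
    metadata:
      labels:
        app: image-preloader
    spec:
      containers:
        - name: preload
          image: google/cloud-sdk:slim      # assumption: any image with gcloud available
          securityContext:
            privileged: true                # needed to reach the node's containerd
          command: ["/bin/sh", "-c"]
          args:
            - |
              gcloud storage cp "gs://crewai-crew-images/crew-images/<ref>.tar.gz" /tmp/img.tar.gz
              gunzip /tmp/img.tar.gz
              ctr -n k8s.io images import /tmp/img.tar
              sleep infinity                # stay Ready until the platform deletes the DaemonSet
          volumeMounts:
            - name: containerd-sock
              mountPath: /run/containerd/containerd.sock
            - name: ctr-bin
              mountPath: /usr/bin/ctr
      volumes:
        - name: containerd-sock
          hostPath:
            path: /run/containerd/containerd.sock   # see CONTAINERD_SOCKET_PATH
        - name: ctr-bin
          hostPath:
            path: /usr/bin/ctr                      # ctr binary under CTR_HOST_PATH
```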

Prerequisites

Before enabling Bucket Deployment, ensure you have:
  1. GCS bucket — A dedicated bucket for storing crew image tarballs
  2. IAM permissions — The Workload Identity GSA must have roles/storage.objectAdmin on the image bucket (for both upload and download)
  3. Workload Identity — Already configured per Step 3
  4. BuildKit — Enabled in the Helm chart (buildkit.enabled: true)
If you already configured roles/storage.objectAdmin at the project level for GCS object storage (see Step 2), the same binding covers the image bucket. No additional IAM configuration is needed — skip to Create the Image Bucket.

Create the Image Bucket

Create a dedicated GCS bucket for crew image tarballs. This bucket is separate from the general-purpose GCS_BUCKET used for crew artifacts and uploads.
export IMAGE_BUCKET="crewai-crew-images"

gcloud storage buckets create gs://$IMAGE_BUCKET \
  --project=$GCP_PROJECT_ID \
  --location=$GCP_REGION \
  --uniform-bucket-level-access
If you scoped roles/storage.objectAdmin to the general-purpose bucket (rather than project-level), grant access to the image bucket separately:
gcloud storage buckets add-iam-policy-binding gs://$IMAGE_BUCKET \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
Lifecycle policies are not recommended for the image bucket. Image tarballs are actively referenced by running crews and must remain available for node rescheduling or cluster scaling events. If you need cleanup, delete tarballs only after the corresponding crew has been undeployed.

Helm Configuration for Bucket Deployment

Set the following environment variables in your Helm values to enable Bucket Deployment:
envVars:
  # Switch the crew deployment provider to bucket mode
  PROVIDER: BUCKET_BUILDKIT_KUBERNETES

  # GCS bucket for storing crew image tarballs (required)
  IMAGE_BUCKET_NAME: "crewai-crew-images"

  # Path prefix inside the bucket (optional, default: "crew-images")
  IMAGE_BUCKET_PREFIX: "crew-images"

  # Image reference name — still required even without a registry.
  # Used as the image name when importing into containerd and
  # referencing in Kubernetes pod specs.
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
CREW_IMAGE_REGISTRY_OVERRIDE is still required even though no registry is used. This value serves as the image name/tag throughout the build, preload, and deploy pipeline. The image is tagged with this name when imported into containerd, and Kubernetes pods reference it in their image: field. Do not remove it.

Configuration Reference

| Environment Variable | Required | Default | Description |
|---|---|---|---|
| PROVIDER | Yes | BUILDKIT_KUBERNETES | Set to BUCKET_BUILDKIT_KUBERNETES to enable bucket mode |
| IMAGE_BUCKET_NAME | Yes | — | GCS bucket name for image tarballs |
| IMAGE_BUCKET_PREFIX | No | crew-images | Path prefix (folder) inside the bucket |
| CREW_IMAGE_REGISTRY_OVERRIDE | Yes | — | Image name reference (used as the containerd image tag) |
| CONTAINERD_SOCKET_PATH | No | /run/containerd/containerd.sock | Path to the containerd socket on nodes |
| CTR_HOST_PATH | No | /usr/bin | Host directory containing the ctr binary |
The CONTAINERD_SOCKET_PATH and CTR_HOST_PATH defaults are correct for standard GKE nodes (Container-Optimized OS and Ubuntu). Override them only if your cluster uses custom node images with non-standard containerd paths.
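If your nodes do use non-standard paths, override both values in your Helm values. For example (the paths below are hypothetical values for a custom node image):

```yaml
envVars:
  CONTAINERD_SOCKET_PATH: "/var/run/containerd/containerd.sock"  # hypothetical custom socket path
  CTR_HOST_PATH: "/usr/local/bin"                                # hypothetical directory containing ctr
```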

What Changes from Artifact Registry Mode

When switching from Artifact Registry to Bucket Deployment, the following Helm values change:
| Value | Artifact Registry | Bucket Deployment |
|---|---|---|
| envVars.PROVIDER | BUILDKIT_KUBERNETES (default) | BUCKET_BUILDKIT_KUBERNETES |
| envVars.IMAGE_BUCKET_NAME | Not set | Your GCS bucket name |
| envVars.IMAGE_BUCKET_PREFIX | Not set | crew-images (or custom) |
| envVars.CREW_IMAGE_REGISTRY_OVERRIDE | GAR path (images pushed here) | Same value (used as image name only) |
| Artifact Registry IAM roles | Required | Not needed |
| Node pool artifactregistry.reader | Required | Not needed |
| Crews namespace Workload Identity | Required (build pods push to GAR) | Required (build pods upload to GCS) |

Verifying Bucket Deployment

After deploying a crew, verify the bucket deployment pipeline:

1. Check the image tarball was uploaded:
gcloud storage ls gs://$IMAGE_BUCKET/crew-images/
# Should list .tar.gz files for deployed crews
2. Check preloader DaemonSet (during deployment):
# The preloader is ephemeral — only visible during active deployments
kubectl get daemonsets -n $GKE_NAMESPACE -l app=image-preloader
3. Verify the image is loaded on nodes:
# SSH into a node and check containerd
gcloud compute ssh NODE_NAME --zone=ZONE -- \
  sudo ctr -n k8s.io images ls | grep crewai
4. Confirm crew pods are running with imagePullPolicy: Never:
kubectl get pods -n crewai-crews -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].imagePullPolicy}{"\n"}{end}'
# Should show "Never" for crew pods deployed via bucket mode

Secret Manager Integration (Optional)

GCP Secret Manager provides centralized secret management for CrewAI Platform.

Which Secrets to Store

Store in Secret Manager (sensitive, need rotation):
  • DB_PASSWORD - Database credentials (if not using IAM auth)
  • SECRET_KEY_BASE - Rails secret key
  • GITHUB_TOKEN - For private repository access
  • Auth provider secrets (ENTRA_ID_CLIENT_SECRET, OKTA_CLIENT_SECRET, etc.)
Keep in values.yaml (configuration, not secrets):
  • DB_HOST, DB_PORT, DB_USER, POSTGRES_DB
  • GCS_PROJECT_ID, GCS_BUCKET
  • APPLICATION_HOST, AUTH_PROVIDER

External Secrets Operator Setup

CrewAI uses External Secrets Operator (ESO) to sync secrets from Secret Manager to Kubernetes. Install ESO (if not already installed):
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets \
  external-secrets/external-secrets \
  --namespace external-secrets-operator \
  --create-namespace

Helm Configuration for Secret Manager

secretStore:
  enabled: true
  provider: "gcp"
  gcp:
    projectID: "your-gcp-project"
    auth:
      workloadIdentity:
        enabled: true
        clusterLocation: "us-central1"
        clusterName: "your-cluster-name"
        serviceAccount: "crewai-sa"

externalSecret:
  enabled: true
  secretStoreKind: SecretStore
  secretStore: "crewai-secret-store"
  secretPath: "crewai-platform"           # Secret name in Secret Manager
  databaseSecretPath: "crewai-db-password" # Separate secret for DB password

External Access (Gateway API or Ingress)

GKE provides built-in support for the Kubernetes Gateway API, which is the recommended way to expose services externally.
The NGINX Ingress Controller was retired in March 2026. For new GKE deployments, Gateway API is recommended over traditional Ingress resources. Existing Ingress configurations continue to work.
GKE ships with built-in GatewayClass resources — no additional controller installation is needed, but Gateway API support must be enabled on the cluster.

Enable Gateway API on GKE

gcloud container clusters update YOUR_CLUSTER_NAME \
  --gateway-api=standard \
  --region=$GCP_REGION \
  --project=$GCP_PROJECT_ID
Verify the GatewayClass resources are available (this may take a minute to propagate):
kubectl get gatewayclass
You should see the following classes:
| GatewayClass | Type | Use Case |
| --- | --- | --- |
| gke-l7-global-external-managed | Global external ALB | Multi-region, CDN, global anycast IP |
| gke-l7-regional-external-managed | Regional external ALB | Single-region, lower latency |
| gke-l7-rilb | Regional internal ALB | Internal-only access within VPC |
If kubectl get gatewayclass returns “No resources found”, the Gateway API CRDs are not installed. The Helm chart will fail with no matches for kind "Gateway". Run the gcloud container clusters update command above and wait for it to complete before deploying.

Helm Configuration

# Gateway API configuration
gateway:
  enabled: true
  create: true
  gatewayClassName: gke-l7-global-external-managed
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: crewai-tls
    - name: http
      protocol: HTTP
      port: 80

web:
  gateway:
    enabled: true
    hostnames:
      - "crewai.your-company.com"

# If OAuth is enabled (shared hostname with /oauthsvc prefix)
oauth:
  enabled: true
  gateway:
    enabled: true
    pathPrefix: "/oauthsvc"
If OAuth requires a dedicated hostname (e.g., because GKE does not support NGINX-style regex path rewriting), set pathPrefix: "/" and specify the hostname:
oauth:
  enabled: true
  gateway:
    enabled: true
    hostname: "oauth.your-company.com"
    pathPrefix: "/"
When using a dedicated OAuth hostname, add it to the Gateway TLS certificate (or create a separate certificate map entry) and update DNS to point to the same Gateway IP.
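With a dedicated hostname, the resulting routing is roughly the following HTTPRoute. This is a sketch using the Gateway API v1 schema; the resource and backend names are hypothetical, and the chart's actual output may differ:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: crewai-oauth            # hypothetical name
spec:
  parentRefs:
    - name: crewai-gateway      # the Gateway created by the chart
  hostnames:
    - oauth.your-company.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: crewai-oauth    # OAuth service
          port: 80
```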
To use a GCP-managed certificate instead of a Kubernetes TLS secret, add the annotation:
gateway:
  enabled: true
  create: true
  gatewayClassName: gke-l7-global-external-managed
  annotations:
    networking.gke.io/certmap: "crewai-cert-map"
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
To create the certificate map:
# Create a managed certificate
gcloud certificate-manager certificates create crewai-cert \
  --domains="crewai.your-company.com"

# Create a certificate map and entry
gcloud certificate-manager maps create crewai-cert-map
gcloud certificate-manager maps entries create crewai-cert-entry \
  --map=crewai-cert-map \
  --certificates=crewai-cert \
  --hostname="crewai.your-company.com"
If you already have a Gateway resource (e.g., shared across multiple apps), reference it instead of creating one:
gateway:
  enabled: true
  create: false
  name: "shared-gateway"
  namespace: "gateway-infra"

web:
  gateway:
    enabled: true
After deploying, get the load balancer IP:
kubectl get gateway -n $GKE_NAMESPACE
# Note the ADDRESS column — update your DNS record to point to it
The Helm chart automatically creates a GKE HealthCheckPolicy that configures the load balancer to use /health as the health check path. Without this, GKE’s health probes use the pod IP as the Host header, which Rails’ HostAuthorization middleware blocks — causing unconditional drop overload errors. This is handled automatically; no manual configuration is needed.
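The generated policy looks approximately like this (GKE's networking.gke.io/v1 HealthCheckPolicy; shown for reference only, since the chart creates it for you, and the metadata and Service names here are hypothetical):

```yaml
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: crewai-web-healthcheck   # hypothetical name
spec:
  default:
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health     # matches the path described above
  targetRef:
    group: ""
    kind: Service
    name: crewai-web             # the web Service
```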

Option 2: GCE Ingress (Native GKE)

If you prefer traditional Ingress resources:
web:
  ingress:
    enabled: true
    className: gce
    host: "crewai.your-company.com"
    annotations:
      kubernetes.io/ingress.global-static-ip-name: "crewai-ip"
      networking.gke.io/managed-certificates: "crewai-cert"

Option 3: NGINX Ingress Controller (Deprecated)

The NGINX Ingress Controller was retired in March 2026. Consider migrating to Gateway API for new deployments.
web:
  ingress:
    enabled: true
    className: nginx
    host: "crewai.your-company.com"
    nginx:
      tls:
        enabled: true
        secretName: "crewai-tls"

Complete GCP Deployment Example

Here is a complete production configuration for GCP:
# values-gcp-production.yaml

# ServiceAccount with Workload Identity annotation
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: crewai-platform@your-project.iam.gserviceaccount.com

# Namespace for crew workloads (default: crewai-crews)
crewNamespace: "crewai-crews"

# Image pull credentials — required for pulling platform images from images.crewai.com.
# When installing via direct Helm (not KOTS), Replicated proxy credentials must be
# provided here. The credentials are the same used for `helm registry login`.
image:
  registries:
    - host: "images.crewai.com"
      username: "your-email@company.com"
      password: "your-replicated-license-token"

# Disable internal services (use GCP managed services)
postgres:
  enabled: false

minio:
  enabled: false

# Cloud SQL Auth Proxy
cloudSqlProxy:
  enabled: true
  instanceConnectionName: "your-project:us-central1:crewai-db"
  port: 5432
  privateIp: true
  autoIamAuthn: false  # Set true if using IAM database authentication

envVars:
  # Database (via Cloud SQL Auth Proxy)
  DB_HOST: "127.0.0.1"
  DB_PORT: "5432"
  DB_USER: "crewai"
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"

  # GCS for object storage
  STORAGE_SERVICE: "google"
  GCS_PROJECT_ID: "your-project"
  GCS_BUCKET: "crewai-prod-storage"
  GCS_IAM_SIGNING: "true"

  # Artifact Registry for crew images
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"

  # Application
  APPLICATION_HOST: "crewai.your-company.com"
  AUTH_PROVIDER: "entra_id"
  RAILS_ENV: "production"
  RAILS_LOG_LEVEL: "info"

# DB_PASSWORD is not needed when using Cloud SQL IAM auth (autoIamAuthn: true).
# For password-based auth, set it in secrets:
# secrets:
#   DB_PASSWORD: "your-secure-password"

# Gateway API (recommended over NGINX Ingress)
gateway:
  enabled: true
  create: true
  gatewayClassName: gke-l7-global-external-managed
  annotations:
    networking.gke.io/certmap: "crewai-cert-map"
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
    - name: http
      protocol: HTTP
      port: 80

# Web
web:
  replicaCount: 3
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"
  gateway:
    enabled: true
    hostnames:
      - "crewai.your-company.com"

# Worker
worker:
  replicaCount: 3
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"

# OAuth (if using built-in integrations)
oauth:
  enabled: true
  gateway:
    enabled: true
    pathPrefix: "/oauthsvc"

# BuildKit
buildkit:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: "500m"
      memory: "2Gi"
    limits:
      cpu: "4"
      memory: "8Gi"

# RBAC
rbac:
  create: true
The image.registries section is required when installing via direct Helm (helm install ... oci://registry.crewai.com/...). It provides credentials for pulling platform images (busybox, redis, buildkit, etc.) from the Replicated proxy at images.crewai.com. Use the same email and license token you used for helm registry login registry.crewai.com. When installing via Replicated KOTS, these credentials are injected automatically and image.registries is not needed.
Deploy:
helm install crewai-platform \
  oci://registry.crewai.com/crewai/stable/crewai-platform \
  --values values-gcp-production.yaml \
  --namespace crewai \
  --create-namespace
Post-install: After the first install, annotate the crews namespace ServiceAccount for Workload Identity (see Step 3):
kubectl annotate serviceaccount default -n crewai-crews --overwrite \
  iam.gke.io/gcp-service-account=crewai-platform@your-project.iam.gserviceaccount.com

Bucket Deployment Variant

To use Bucket Deployment instead of Artifact Registry, modify the envVars section in the example above:
envVars:
  # Database (via Cloud SQL Auth Proxy)
  DB_HOST: "127.0.0.1"
  DB_PORT: "5432"
  DB_USER: "crewai-platform@your-project.iam"
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"

  # GCS for object storage
  STORAGE_SERVICE: "google"
  GCS_PROJECT_ID: "your-project"
  GCS_BUCKET: "crewai-prod-storage"
  GCS_IAM_SIGNING: "true"

  # Bucket Deployment (registry-less crew image delivery)
  PROVIDER: BUCKET_BUILDKIT_KUBERNETES
  IMAGE_BUCKET_NAME: "crewai-crew-images"
  IMAGE_BUCKET_PREFIX: "crew-images"

  # Image reference name (still required — used as containerd image tag)
  CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"

  # Application
  APPLICATION_HOST: "crewai.your-company.com"
  AUTH_PROVIDER: "entra_id"
  RAILS_ENV: "production"
  RAILS_LOG_LEVEL: "info"
When using Bucket Deployment, Artifact Registry IAM roles (roles/artifactregistry.writer, roles/artifactregistry.reader) are not needed. The roles/storage.objectAdmin role on the image bucket handles both upload (from build pods) and download (from preloader pods). The Crews namespace Workload Identity binding is still required — build pods use it to obtain GCS access tokens for uploading image tarballs.
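As a sketch, the grant on the image bucket looks like this, using the hypothetical bucket and GSA names from the example values above. The gcloud call is shown commented since it requires authenticated credentials:

```shell
# Hypothetical names matching the example values above.
BUCKET="gs://crewai-crew-images"                             # IMAGE_BUCKET_NAME
GSA="crewai-platform@your-project.iam.gserviceaccount.com"
MEMBER="serviceAccount:${GSA}"
echo "grant roles/storage.objectAdmin to ${MEMBER} on ${BUCKET}"

# Requires gcloud credentials:
# gcloud storage buckets add-iam-policy-binding "$BUCKET" \
#   --member="$MEMBER" --role="roles/storage.objectAdmin"
```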

Troubleshooting GCP-Specific Issues

Workload Identity Not Working

Symptoms: Pods get 403 Forbidden or could not retrieve default credentials errors. Verify Workload Identity is enabled on the cluster:
gcloud container clusters describe YOUR_CLUSTER \
  --region=$GCP_REGION \
  --format='value(workloadIdentityConfig.workloadPool)'
# Should output: GCP_PROJECT_ID.svc.id.goog
Check ServiceAccount annotation:
kubectl get serviceaccount $KSA_NAME -n $GKE_NAMESPACE -o yaml | grep gcp-service-account
# Should show: iam.gke.io/gcp-service-account: GSA@PROJECT.iam.gserviceaccount.com
Test from a pod:
kubectl run -it --rm wi-test \
  --namespace=$GKE_NAMESPACE \
  --image=google/cloud-sdk:slim \
  --serviceaccount=$KSA_NAME \
  --restart=Never -- \
  gcloud auth print-access-token
If this fails, verify the IAM binding:
gcloud iam service-accounts get-iam-policy \
  ${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com
# Should list the workloadIdentityUser binding for your KSA
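If the binding is absent, add it with gcloud iam service-accounts add-iam-policy-binding and the roles/iam.workloadIdentityUser role. The member string must reference your namespace and KSA in exactly this format; a local sketch using the placeholder values from this guide:

```shell
GCP_PROJECT_ID="your-gcp-project"
GKE_NAMESPACE="crewai"     # platform namespace
KSA_NAME="crewai-sa"       # Kubernetes ServiceAccount name

# Workload Identity member format: PROJECT.svc.id.goog[NAMESPACE/KSA]
MEMBER="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"
echo "$MEMBER"
```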

GCS Signed URL Errors

Symptoms: Logs show Google::Cloud::Storage::SignedUrlUnavailable: Service account credentials 'issuer (client_email)' is missing when deploying crews. This happens when Workload Identity is used but IAM-based URL signing is not enabled. The google-cloud-storage gem cannot sign URLs without a private key; with Workload Identity, the IAM signBlob API must be used instead. Fix:
  1. Ensure GCS_IAM_SIGNING: "true" is set in envVars
  2. Grant the roles/iam.serviceAccountTokenCreator role to the GSA:
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
  --member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountTokenCreator"
  3. Redeploy the Helm chart

GCS Access Denied

Symptoms: Logs show Google::Cloud::PermissionDeniedError for storage operations.
# Verify the GSA has storage permissions
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --format="table(bindings.role)"

# Test from pod
kubectl exec -it deploy/crewai-web -n $GKE_NAMESPACE -- \
  ruby -e "require 'google/cloud/storage'; puts Google::Cloud::Storage.new.buckets.map(&:name)"

Cloud SQL Connection Failures

Symptoms: Pods show connection refused on 127.0.0.1:5432. Check the Cloud SQL Proxy sidecar is running:
kubectl get pods -n $GKE_NAMESPACE -l app.kubernetes.io/component=web -o jsonpath='{.items[0].status.containerStatuses[*].name}'
# Should include: cloud-sql-proxy

kubectl logs deploy/crewai-web -n $GKE_NAMESPACE -c cloud-sql-proxy
Verify the instance connection name:
gcloud sql instances describe $SQL_INSTANCE --format='value(connectionName)'
# Should match cloudSqlProxy.instanceConnectionName in your values

Cloud SQL IAM Authentication Failures

Symptoms: fe_sendauth: no password supplied or issue connecting with your username/password.
“no password supplied” means autoIamAuthn is not enabled in your Helm values:
cloudSqlProxy:
  autoIamAuthn: true  # Must be true for IAM auth
“issue connecting with your username/password” — the IAM token is being injected but rejected. Check these in order:
  1. IAM authentication flag on the instance (most common miss — off by default):
gcloud sql instances describe $SQL_INSTANCE \
  --format='value(settings.databaseFlags)'
# Must include: cloudsql.iam_authentication=on

# If missing, enable it:
gcloud sql instances patch $SQL_INSTANCE \
  --database-flags=cloudsql.iam_authentication=on
  2. IAM database user exists:
gcloud sql users list --instance=$SQL_INSTANCE --format="table(name,type)"
# Must show: GSA_NAME@GCP_PROJECT_ID.iam   CLOUD_IAM_SERVICE_ACCOUNT

# If missing:
gcloud sql users create ${GSA_NAME}@${GCP_PROJECT_ID}.iam \
  --instance=$SQL_INSTANCE \
  --type=CLOUD_IAM_SERVICE_ACCOUNT
  3. GSA has required IAM roles:
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --format="table(bindings.role)"
# Must include: roles/cloudsql.client AND roles/cloudsql.instanceUser
  4. DB_USER is in IAM format (in Helm values):
envVars:
  DB_USER: "crewai-platform@your-project.iam"  # NOT "crewai"
  5. SQL privileges granted — the IAM user must have been granted access to the databases via GRANT ALL PRIVILEGES (see “Create Databases and User” section above)
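On point 4, the IAM username is simply the GSA email with the .gserviceaccount.com suffix removed; a quick local sanity check:

```shell
GSA_EMAIL="crewai-platform@your-project.iam.gserviceaccount.com"
# Cloud SQL IAM usernames drop the ".gserviceaccount.com" suffix:
DB_USER="${GSA_EMAIL%.gserviceaccount.com}"
echo "$DB_USER"   # crewai-platform@your-project.iam
```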

Cloud SQL IAM: Table Ownership Errors During Upgrades

Symptoms: Rails migrations fail with PG::InsufficientPrivilege: ERROR: must be owner of table <table_name>. Root cause: PostgreSQL DDL operations (ALTER TABLE, DROP COLUMN, ADD INDEX, etc.) require the executing user to be the owner of the table. When using IAM authentication, the IAM database user typically does not own tables that were created by the postgres superuser or a previous password-based user. GRANT ALL PRIVILEGES grants read/write access but does not transfer ownership. Fix: Connect as the postgres superuser and transfer ownership of all tables, sequences, and views to the IAM user. See Step 5 of the IAM authentication setup for the exact SQL commands. Prevention: Run the ownership transfer SQL during initial IAM auth setup (Step 5), before the first Helm upgrade.

Artifact Registry Push Failures

Symptoms: Crew deployments fail with unauthorized or denied during image push. Verify the GSA has Artifact Registry write access:
gcloud artifacts repositories get-iam-policy $AR_REPO \
  --location=$GCP_REGION \
  --format="table(bindings.role, bindings.members)"
Check BuildKit pod logs:
# Find the most recent buildkit build pod
kubectl get pods -n $GKE_NAMESPACE -l app=buildkit-build --sort-by=.metadata.creationTimestamp

# Check logs for GAR auth
kubectl logs POD_NAME -n $GKE_NAMESPACE -c buildkit-client | grep "GAR authentication"
Check Workload Identity annotation on the crews namespace default SA (build pods use this SA):
kubectl get serviceaccount default -n crewai-crews -o jsonpath='{.metadata.annotations}'
# Should include: iam.gke.io/gcp-service-account: GSA@PROJECT.iam.gserviceaccount.com
If missing, annotate it (see Step 3).

Artifact Registry Pull Failures (Crew Pods)

Symptoms: Crew pods in crewai-crews namespace show ImagePullBackOff or ErrImagePull with 403 Forbidden when pulling from *-docker.pkg.dev. This can have multiple causes:
  1. Wrong pull secret name: The crew pod references a pull secret that doesn’t exist or has empty credentials.
# Check which secret the crew pod references
kubectl get pod POD_NAME -n crewai-crews -o jsonpath='{.spec.imagePullSecrets}'

# Verify the secret exists and has data
kubectl get secret SECRET_NAME -n crewai-crews -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
If the decoded config shows {"auths":{}} (empty), the Helm chart’s image.registries may not be configured. See the Complete Example.
  2. Node pool compute SA lacks GAR read access:
# Verify node pool OAuth scopes (should include cloud-platform)
gcloud container node-pools describe default-pool \
  --cluster YOUR_CLUSTER \
  --region $GCP_REGION \
  --format='value(config.oauthScopes)'

# Check if the compute SA has artifactregistry.reader
GCE_SA="$(gcloud projects describe $GCP_PROJECT_ID --format='value(projectNumber)')-compute@developer.gserviceaccount.com"
gcloud projects get-iam-policy $GCP_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:${GCE_SA} AND bindings.role:roles/artifactregistry" \
  --format="table(bindings.role)"
If the role is missing, grant it (see Step 2).
  3. Verify the image exists in GAR:
gcloud artifacts docker images list \
  ${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/crewai/crewai-enterprise \
  --include-tags

Build Pod Image Pull Failures

Symptoms: Build pods in the platform namespace show ErrImagePull for images from images.crewai.com. This means the registry pull secret lacks Replicated proxy credentials. Verify:
# Check the pull secret contents
kubectl get secret $(kubectl get pod POD_NAME -n $GKE_NAMESPACE -o jsonpath='{.spec.imagePullSecrets[0].name}') \
  -n $GKE_NAMESPACE -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | python3 -m json.tool
If images.crewai.com is not in the auths section, add Replicated proxy credentials to image.registries in your Helm values and redeploy (see the Complete Example).
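For comparison, a non-empty pull secret decodes to something like the config built here with placeholder credentials (the auth field is the base64 of user:password):

```shell
python3 - <<'EOF'
import base64, json
# Placeholder credentials — use your Replicated email and license token.
user, pw = "your-email@company.com", "your-replicated-license-token"
auth = base64.b64encode(f"{user}:{pw}".encode()).decode()
cfg = {"auths": {"images.crewai.com": {"username": user, "password": pw, "auth": auth}}}
print(json.dumps(cfg, indent=2))
EOF
```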

Secret Manager Access Denied

Symptoms: ExternalSecret shows SecretSyncedError.
kubectl get externalsecret -n $GKE_NAMESPACE
kubectl describe externalsecret crewai-external-secret -n $GKE_NAMESPACE

kubectl get secretstore -n $GKE_NAMESPACE
kubectl describe secretstore crewai-secret-store -n $GKE_NAMESPACE

Bucket Deployment: Upload Failed

Symptoms: Crew deployment fails during the build phase with Bucket upload failed in the build pod logs. Check the build pod logs:
kubectl get pods -n $GKE_NAMESPACE -l app=buildkit-build --sort-by=.metadata.creationTimestamp
kubectl logs POD_NAME -n $GKE_NAMESPACE -c buildkit-client | grep -A5 "bucket"
Common causes:
  1. Missing IMAGE_BUCKET_NAME — The environment variable is not set. Verify it appears in your Helm values under envVars.
  2. Bucket does not exist — Verify the bucket exists:
gcloud storage buckets describe gs://$IMAGE_BUCKET
  3. Missing IAM permissions — The Workload Identity GSA needs roles/storage.objectAdmin on the image bucket:
gcloud storage buckets get-iam-policy gs://$IMAGE_BUCKET \
  --format="table(bindings.role, bindings.members)" \
  | grep $GSA_NAME
  4. Metadata server unreachable — Build pods obtain access tokens from the GCE metadata server. If the logs show Failed to get access token from metadata server, verify that Workload Identity is bound for the crews namespace (see Step 3).

Bucket Deployment: Preloader Timeout

Symptoms: Crew deployment hangs at the preload phase. The preloader DaemonSet pods are not reaching Ready state. Check the preloader DaemonSet and pod status:
kubectl get daemonsets -n $GKE_NAMESPACE -l app=image-preloader
kubectl get pods -n $GKE_NAMESPACE -l app=image-preloader -o wide
kubectl logs -n $GKE_NAMESPACE -l app=image-preloader --tail=50
Common causes:
  1. Image tarball not found in bucket — The preloader downloads from the bucket. If the build phase failed silently or the object name doesn’t match:
gcloud storage ls gs://$IMAGE_BUCKET/crew-images/
  2. GCS download permission denied — The preloader pods run in the platform namespace and use the platform KSA’s Workload Identity. Verify the GSA has read access to the image bucket.
  3. containerd socket not accessible — The preloader mounts the host’s containerd socket. If the node uses a non-standard socket path, set CONTAINERD_SOCKET_PATH in your Helm values. Check the default path:
# SSH into a node
gcloud compute ssh NODE_NAME --zone=ZONE -- ls -la /run/containerd/containerd.sock
  4. ctr binary not found — The preloader uses the host’s ctr binary from /usr/bin. If it’s in a different location on your nodes, set CTR_HOST_PATH in your Helm values:
gcloud compute ssh NODE_NAME --zone=ZONE -- which ctr

Bucket Deployment: ErrImageNeverPull

Symptoms: Crew pods show ErrImageNeverPull status after deployment. This means the crew pod uses imagePullPolicy: Never but the image is not present in the node’s containerd store. This can happen if:
  1. Preloader didn’t complete on this node — The node may have been added to the cluster after the preloader DaemonSet ran. Redeploy the crew to trigger a fresh preload cycle.
  2. containerd image was garbage collected — containerd may have cleaned up unused images. Redeploy the crew.
  3. Image name mismatch — The CREW_IMAGE_REGISTRY_OVERRIDE value must be consistent across build, preload, and deploy. Verify:
# Check what image the pod is trying to use
kubectl get pod POD_NAME -n crewai-crews -o jsonpath='{.spec.containers[0].image}'

# Check what images are loaded on the node
gcloud compute ssh NODE_NAME --zone=ZONE -- \
  sudo ctr -n k8s.io images ls | grep crewai
The image name from the pod spec must exactly match an image in the ctr output.
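A local sketch of that comparison, using hypothetical image references; note the match must cover registry, repository, and tag:

```shell
# Hypothetical image references; the tag is part of the match.
POD_IMAGE="us-central1-docker.pkg.dev/your-project/crewai/crew-1234:v3"
NODE_IMAGES="us-central1-docker.pkg.dev/your-project/crewai/crew-1234:v3
docker.io/library/redis:7"

# -x: whole-line match, -F: literal string (no regex)
if printf '%s\n' "$NODE_IMAGES" | grep -qxF "$POD_IMAGE"; then
  echo "image present on node"
else
  echo "image missing: ErrImageNeverPull expected"
fi
```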