Overview
This guide covers GCP-specific integration for CrewAI Platform deployments on Google Kubernetes Engine (GKE). It focuses on configuring GCS for object storage, Artifact Registry (or registry-less Bucket Deployment) for crew image builds, Cloud SQL for PostgreSQL, and optionally Secret Manager — all authenticated via GKE Workload Identity Federation (no static keys).
This guide assumes you have:
- A GKE cluster running Kubernetes 1.28+ with Workload Identity enabled
- gcloud CLI and kubectl configured
- Helm 3.10+ installed
- Basic familiarity with GCP services (Cloud SQL, GCS, Artifact Registry)
Prerequisites
Before configuring CrewAI Platform, ensure these GCP components are in place:
CrewAI Platform supports AMD64 (x86_64) Kubernetes worker nodes. ARM64 (aarch64) worker nodes are not currently supported. For full platform requirements, see the Requirements Guide.
Required GCP Infrastructure
| Component | Documentation Link | Notes |
|---|---|---|
| GKE Cluster | GKE Quickstart | 1.28+ with Workload Identity enabled |
| Workload Identity | Workload Identity Federation | Must be enabled at cluster and node pool level |
| VPC and Subnets | GKE Network Overview | Private cluster recommended |
Required GCP APIs
Enable the following APIs in your project before proceeding:
gcloud services enable \
container.googleapis.com \
sqladmin.googleapis.com \
storage-api.googleapis.com \
artifactregistry.googleapis.com \
secretmanager.googleapis.com \
certificatemanager.googleapis.com \
iam.googleapis.com
Do not proceed with CrewAI installation until these prerequisites are met. The Helm chart will fail to deploy without them.
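To sanity-check the prerequisite APIs before installing, you can compare the required list against what is enabled in the project. The snippet below is a minimal sketch: the required list mirrors the enable command above, and the commented gcloud line (which requires an authenticated CLI) shows how to fetch the enabled set for comparison.

```shell
# All seven APIs from the "gcloud services enable" command above
REQUIRED_APIS="container.googleapis.com sqladmin.googleapis.com \
storage-api.googleapis.com artifactregistry.googleapis.com \
secretmanager.googleapis.com certificatemanager.googleapis.com iam.googleapis.com"

# Fetch the enabled services (requires gcloud auth):
# gcloud services list --enabled --format='value(config.name)' > /tmp/enabled.txt

for api in $REQUIRED_APIS; do
  echo "checking: $api"
  # grep -qx "$api" /tmp/enabled.txt || echo "MISSING: $api"
done
```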
Step 1: Create the GCP Service Account
All CrewAI workloads (web, worker, buildkit) share a single GCP Service Account (GSA) that is mapped to the Kubernetes ServiceAccount via Workload Identity. This GSA needs permissions for GCS, Artifact Registry, Cloud SQL, and optionally Secret Manager.
export GCP_PROJECT_ID="your-gcp-project"
export GCP_REGION="us-central1"
export GSA_NAME="crewai-platform"
export GKE_NAMESPACE="crewai" # Kubernetes namespace for the Helm release
export KSA_NAME="crewai-sa" # Kubernetes ServiceAccount (chart default)
# Create the GCP Service Account
gcloud iam service-accounts create $GSA_NAME \
--display-name="CrewAI Platform" \
--project=$GCP_PROJECT_ID
Step 2: Grant IAM Roles
Bind the required IAM roles to the service account. Each role maps to a specific CrewAI requirement:
| Role | Assigned To | Purpose |
|---|---|---|
| roles/storage.objectAdmin | GSA | Read/write crew artifacts and uploads in GCS. Also covers image bucket read/write when using Bucket Deployment |
| roles/iam.serviceAccountTokenCreator | GSA | Sign GCS URLs via IAM signBlob (required for Workload Identity) |
| roles/artifactregistry.writer | GSA | Push and pull crew container images (build pods via Workload Identity). Not needed if using Bucket Deployment |
| roles/cloudsql.client | GSA | Connect to Cloud SQL via the Auth Proxy |
| roles/cloudsql.instanceUser | GSA | IAM-based database authentication (only if using autoIamAuthn: true) |
| roles/artifactregistry.reader | Node pool GCE SA | Pull crew images from GAR (kubelet uses node-level credentials). Not needed if using Bucket Deployment |
| roles/secretmanager.secretAccessor | GSA | Read secrets from Secret Manager (optional) |
# GCS - object storage for crew artifacts and uploads
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"
# GCS - signed URL generation via IAM signBlob (required for Workload Identity)
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountTokenCreator"
# Artifact Registry - push/pull crew container images
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/artifactregistry.writer"
# Cloud SQL - database connectivity via Cloud SQL Auth Proxy
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/cloudsql.client"
# Cloud SQL - IAM database authentication (only if using autoIamAuthn: true)
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/cloudsql.instanceUser"
# (Optional) Secret Manager - external secrets
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
# Node pool GCE SA - pull crew images from Artifact Registry
# The kubelet uses the node's compute SA (not Workload Identity) for image pulls
GCE_SA="$(gcloud projects describe $GCP_PROJECT_ID --format='value(projectNumber)')-compute@developer.gserviceaccount.com"
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GCE_SA}" \
--role="roles/artifactregistry.reader"
For tighter security, scope roles/storage.objectAdmin and roles/artifactregistry.writer to specific resources using --condition flags or bucket/repo-level IAM instead of project-level bindings.

The node pool GCE SA binding is separate from the GSA bindings. The platform injects a short-lived GAR token at deploy time for immediate pulls, but the node SA provides a reliable fallback when crew pods are rescheduled after the token expires.
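After granting the roles above, it is worth confirming what actually landed on the GSA. The sketch below builds the member string from the exports at the top of this guide and shows (commented, since it needs an authenticated gcloud) how to list every role bound to it at the project level.

```shell
# Values assumed from the exports in Step 1
GCP_PROJECT_ID="your-gcp-project"
GSA_NAME="crewai-platform"

MEMBER="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com"
echo "$MEMBER"

# List the roles actually bound to this member (requires gcloud auth):
# gcloud projects get-iam-policy "$GCP_PROJECT_ID" \
#   --flatten='bindings[].members' \
#   --filter="bindings.members:${MEMBER}" \
#   --format='value(bindings.role)'
```

The output should include every role from the table in Step 2 (except roles/artifactregistry.reader, which is bound to the node pool GCE SA instead).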
Step 3: Bind Workload Identity
Create the IAM policy bindings that allow the Kubernetes ServiceAccounts to impersonate the GCP Service Account. Two bindings are needed: one for the platform namespace (web, worker, buildkit daemon) and one for the crews namespace (build pods that push images to GAR):
# Platform namespace — used by web, worker, and buildkit daemon pods
gcloud iam service-accounts add-iam-policy-binding \
${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"
# Crews namespace — used by build pods that push crew images to Artifact Registry
gcloud iam service-accounts add-iam-policy-binding \
${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[crewai-crews/default]"
After the initial Helm install (which creates the crewai-crews namespace automatically), annotate the default ServiceAccount in the crews namespace so build pods can authenticate to GAR via Workload Identity:
kubectl annotate serviceaccount default -n crewai-crews --overwrite \
iam.gke.io/gcp-service-account=${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com
The [NAMESPACE/KSA_NAME] in the member string must exactly match the Kubernetes namespace and the ServiceAccount name. The chart creates crewai-sa by default when rbac.create: true. The crews namespace (crewai-crews by default) uses the default ServiceAccount for build pods.

The kubectl annotate command must run after the first helm install because the Helm chart creates the crewai-crews namespace. If you need to annotate before installing, create the namespace manually first: kubectl create namespace crewai-crews.
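Because the member string must match exactly, a quick local check catches typos before gcloud rejects (or silently misbinds) them. This sketch reconstructs both member strings from the exports in Step 1; the commented command shows how to confirm the bindings on the GSA afterwards.

```shell
# Values assumed from the exports in Step 1
GCP_PROJECT_ID="your-gcp-project"
GKE_NAMESPACE="crewai"
KSA_NAME="crewai-sa"

# Platform namespace binding member
echo "serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"
# Crews namespace binding member (default KSA in crewai-crews)
echo "serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[crewai-crews/default]"

# Confirm both bindings exist on the GSA (requires gcloud auth):
# gcloud iam service-accounts get-iam-policy \
#   "crewai-platform@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
#   --format='yaml(bindings)'
```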
Cloud SQL for PostgreSQL
CrewAI Platform requires PostgreSQL 16+ for production deployments.
Cloud SQL Instance Sizing
Minimum recommended specifications based on CrewAI workload characteristics:
| Deployment Size | Machine Type | vCPU | RAM | Storage |
|---|---|---|---|---|
| Development | db-custom-2-4096 | 2 | 4 GiB | 50 GiB SSD |
| Small Production | db-custom-2-16384 | 2 | 16 GiB | 100 GiB SSD |
| Medium Production | db-custom-4-32768 | 4 | 32 GiB | 250 GiB SSD |
| Large Production | db-custom-8-65536 | 8 | 64 GiB | 500 GiB SSD |
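The machine types in this table follow Cloud SQL's custom-tier naming, db-custom-<vCPU>-<RAM in MiB>, so the second number is the RAM column converted from GiB. A quick check against the Medium Production row:

```shell
# db-custom tiers encode RAM in MiB: db-custom-<vCPU>-<GiB * 1024>
vcpu=4
ram_gib=32
tier="db-custom-${vcpu}-$((ram_gib * 1024))"
echo "$tier"   # matches the Medium Production row above
```

Use the same arithmetic to derive a tier for sizes not listed in the table.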
Create the Cloud SQL Instance
export SQL_INSTANCE="crewai-db"
gcloud sql instances create $SQL_INSTANCE \
--database-version=POSTGRES_16 \
--tier=db-custom-2-16384 \
--region=$GCP_REGION \
--storage-size=100GB \
--storage-type=SSD \
--storage-auto-increase \
--network=default \
--no-assign-ip
Create Databases and User
CrewAI requires three databases (primary, cable, and OAuth), plus an optional fourth for Wharf tracing:
# Create databases
gcloud sql databases create crewai_plus_production --instance=$SQL_INSTANCE
gcloud sql databases create crewai_plus_cable_production --instance=$SQL_INSTANCE
gcloud sql databases create crewai_plus_oauth_production --instance=$SQL_INSTANCE
# If using Wharf OTLP trace collector (wharf.enabled: true)
gcloud sql databases create wharf --instance=$SQL_INSTANCE
You can use either password-based or IAM-based authentication:
Option A: Password-based authentication (simpler)
gcloud sql users create crewai \
--instance=$SQL_INSTANCE \
--password="YOUR_SECURE_PASSWORD"
Option B: IAM-based authentication (no static passwords)
IAM authentication uses Workload Identity to authenticate the application to Cloud SQL via short-lived tokens. No DB_PASSWORD is needed.
Step 1: Enable IAM authentication on the Cloud SQL instance:
gcloud sql instances patch $SQL_INSTANCE \
--database-flags=cloudsql.iam_authentication=on
The cloudsql.iam_authentication flag is off by default. Without it, Cloud SQL rejects IAM tokens regardless of user or role configuration.
Step 2: Create the IAM database user:
gcloud sql users create ${GSA_NAME}@${GCP_PROJECT_ID}.iam \
--instance=$SQL_INSTANCE \
--type=CLOUD_IAM_SERVICE_ACCOUNT
Step 3: Grant the GSA the Cloud SQL IAM login role:
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/cloudsql.instanceUser"
Step 4: Grant SQL privileges to the IAM user.
Connect to the Cloud SQL instance using the postgres superuser (recommended) or an existing admin user. You can connect via gcloud sql connect or from inside a running pod:
# Option 1: Connect via gcloud (requires a public IP on the instance; the example above used --no-assign-ip, so prefer Option 2 for private-only instances)
gcloud sql connect $SQL_INSTANCE --user=postgres --database=postgres
# Option 2: Connect from inside a pod with the Cloud SQL Auth Proxy sidecar
kubectl exec -it -n $GKE_NAMESPACE deploy/crewai-web -c $(kubectl get deploy crewai-web -n $GKE_NAMESPACE -o jsonpath='{.spec.template.spec.containers[0].name}') -- \
bash -c 'PGPASSWORD=YOUR_PASSWORD psql -h 127.0.0.1 -U postgres -d postgres'
Then run the following SQL commands (replace GSA_NAME@GCP_PROJECT_ID.iam with your actual IAM user, e.g., crewai-platform@your-project.iam):
-- Grant connect + full privileges on each database
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_cable_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_oauth_production TO "GSA_NAME@GCP_PROJECT_ID.iam";
-- If using Wharf (wharf.enabled: true)
GRANT ALL PRIVILEGES ON DATABASE wharf TO "GSA_NAME@GCP_PROJECT_ID.iam";
-- Switch to primary database and grant schema/table/sequence access
\c crewai_plus_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";
-- Repeat for cable database
\c crewai_plus_cable_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";
-- Repeat for OAuth database
\c crewai_plus_oauth_production
GRANT ALL ON SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO "GSA_NAME@GCP_PROJECT_ID.iam";
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO "GSA_NAME@GCP_PROJECT_ID.iam";
Migrating from password-based to IAM auth? If your databases already have tables created by a password-based user (e.g., crewai), you must run the GRANT ALL ON ALL TABLES and GRANT ALL ON ALL SEQUENCES commands above for each database. Without these grants, the IAM user will get permission denied for table schema_migrations errors during migrations. The ALTER DEFAULT PRIVILEGES statements only apply to future objects — they do not retroactively grant access to existing tables.
Step 5: Transfer table ownership to the IAM user.
PostgreSQL DDL operations (e.g., ALTER TABLE, DROP COLUMN) require the executing user to be the owner of the table — GRANT ALL PRIVILEGES is not sufficient. If the databases were initially set up by a different user (e.g., postgres or a password-based crewai user), the IAM user will not own existing tables, and future upgrade migrations that modify table structure will fail.
The pre-upgrade migration hook automatically detects this condition and will report the affected tables with the exact SQL to fix them. To prevent this from blocking your first upgrade, transfer ownership proactively.
Connect to the database as the postgres superuser using the same host and port the application uses:
PGPASSWORD=<postgres-password> psql -h "$DB_HOST" -p "${DB_PORT:-5432}" -U postgres -d "$POSTGRES_DB"
Then run the following SQL (replace the IAM user with your actual value, e.g., crewai-platform@your-project.iam):
-- Transfer all tables and sequences in the primary database
\c crewai_plus_production
DO $$ DECLARE r RECORD; BEGIN
FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
END LOOP;
FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
END LOOP;
END $$;
-- Repeat for cable database
\c crewai_plus_cable_production
DO $$ DECLARE r RECORD; BEGIN
FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
END LOOP;
FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
END LOOP;
END $$;
-- Repeat for OAuth database
\c crewai_plus_oauth_production
DO $$ DECLARE r RECORD; BEGIN
FOR r IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
EXECUTE 'ALTER TABLE public.' || quote_ident(r.tablename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
END LOOP;
FOR r IN SELECT sequencename FROM pg_sequences WHERE schemaname = 'public' LOOP
EXECUTE 'ALTER SEQUENCE public.' || quote_ident(r.sequencename) || ' OWNER TO "GSA_NAME@GCP_PROJECT_ID.iam"';
END LOOP;
END $$;
Any tables the IAM user creates going forward (via Rails migrations) are automatically owned by it, so this transfer only needs to be done once for pre-existing tables.
Use the postgres superuser for these commands. Only the current owner or a superuser can transfer table ownership. A non-superuser like crewai can run GRANT commands on objects it owns, but cannot run ALTER TABLE ... OWNER TO for tables it doesn’t own.
Cloud SQL Auth Proxy (Recommended)
The Helm chart includes a built-in Cloud SQL Auth Proxy sidecar. When enabled, it runs alongside the web, worker, OAuth, Wharf, and job containers, authenticating via Workload Identity. The app connects to 127.0.0.1 through the proxy.
# Helm values
cloudSqlProxy:
enabled: true
instanceConnectionName: "your-project:us-central1:crewai-db" # GCP_PROJECT_ID:GCP_REGION:INSTANCE
port: 5432
privateIp: true # Use private IP (recommended for VPC-peered instances)
autoIamAuthn: true # IAM-based authentication (no password needed)
If you prefer password-based auth through the proxy, set autoIamAuthn: false and provide DB_PASSWORD via secrets.
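The instanceConnectionName is always PROJECT:REGION:INSTANCE. This sketch assembles it from the exports used earlier in this guide; the commented command shows how to read the authoritative value from the instance itself.

```shell
# Values assumed from the exports earlier in this guide
GCP_PROJECT_ID="your-project"
GCP_REGION="us-central1"
SQL_INSTANCE="crewai-db"

ICN="${GCP_PROJECT_ID}:${GCP_REGION}:${SQL_INSTANCE}"
echo "$ICN"

# Or read it directly from Cloud SQL (requires gcloud auth):
# gcloud sql instances describe "$SQL_INSTANCE" --format='value(connectionName)'
```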
Helm Database Configuration
For IAM authentication (autoIamAuthn: true):
postgres:
enabled: false # Disable built-in PostgreSQL
envVars:
DB_HOST: "127.0.0.1" # Cloud SQL Auth Proxy
DB_PORT: "5432"
DB_USER: "crewai-platform@your-project.iam" # GSA email minus .gserviceaccount.com
POSTGRES_DB: "crewai_plus_production"
POSTGRES_CABLE_DB: "crewai_plus_cable_production"
POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"
No DB_PASSWORD is needed — the Cloud SQL Auth Proxy handles authentication automatically via Workload Identity.
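The DB_USER value for IAM authentication is derived mechanically from the GSA email: strip the .gserviceaccount.com suffix. A minimal sketch, assuming the GSA email from Step 1:

```shell
# IAM DB_USER = GSA email with the .gserviceaccount.com suffix removed
GSA_EMAIL="crewai-platform@your-project.iam.gserviceaccount.com"
DB_USER="${GSA_EMAIL%.gserviceaccount.com}"
echo "$DB_USER"
```

This must also match the IAM database user created with gcloud sql users create in Option B, Step 2.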
For password-based authentication (autoIamAuthn: false):
postgres:
enabled: false
envVars:
DB_HOST: "127.0.0.1"
DB_PORT: "5432"
DB_USER: "crewai"
POSTGRES_DB: "crewai_plus_production"
POSTGRES_CABLE_DB: "crewai_plus_cable_production"
POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"
secrets:
DB_PASSWORD: "YOUR_SECURE_PASSWORD"
Google Cloud Storage for Object Storage
CrewAI Platform uses GCS for storing crew artifacts, tool outputs, and user uploads. With Workload Identity, no static credentials are needed.
Create GCS Bucket
export GCS_BUCKET="crewai-prod-storage"
gsutil mb -p $GCP_PROJECT_ID -l $GCP_REGION -b on gs://$GCS_BUCKET
# Enable versioning for data protection
gsutil versioning set on gs://$GCS_BUCKET
# Set lifecycle rule to clean up old versions (optional)
cat > /tmp/lifecycle.json << 'EOF'
{
"rule": [{
"action": {"type": "Delete"},
"condition": {"numNewerVersions": 3, "isLive": false}
}]
}
EOF
gsutil lifecycle set /tmp/lifecycle.json gs://$GCS_BUCKET
Helm Configuration for GCS
envVars:
STORAGE_SERVICE: "google"
GCS_PROJECT_ID: "your-gcp-project"
GCS_BUCKET: "crewai-prod-storage"
GCS_IAM_SIGNING: "true"
# Optional: explicit GSA email for signed URLs (auto-detected from metadata server if blank)
# GCS_SIGNING_EMAIL: "crewai-platform@your-gcp-project.iam.gserviceaccount.com"
No credentials or keyfile configuration is required. The google-cloud-storage gem uses Application Default Credentials (ADC), which are automatically provided by GKE Workload Identity.
GCS_IAM_SIGNING is required when using Workload Identity. Without it, GCS signed URL generation will fail because Workload Identity credentials don’t include a private key for local signing. When enabled, the platform uses the IAM signBlob API instead, which requires the roles/iam.serviceAccountTokenCreator role granted in Step 2.

If GCS_SIGNING_EMAIL is left blank, the service account email is automatically detected from the GKE metadata server.
Artifact Registry for Container Images
CrewAI Platform requires Artifact Registry for storing crew automation container images. When users create and deploy crews, CrewAI builds container images and pushes them to your registry.
Repository Requirements
Critical Requirements:
- Repository URI must end in /crewai-enterprise once the platform appends its suffix (set the value to the repository path itself; the suffix is added automatically)
- Immutable tags must be disabled (CrewAI overwrites tags for crew versions)
Create Artifact Registry Repository
export AR_REPO="crewai"
gcloud artifacts repositories create $AR_REPO \
--repository-format=docker \
--location=$GCP_REGION \
--description="CrewAI Platform crew images"
# Verify the repository URI
gcloud artifacts repositories describe $AR_REPO \
--location=$GCP_REGION \
--format='value(name)'
# Output: projects/GCP_PROJECT_ID/locations/GCP_REGION/repositories/crewai
The resulting registry host is GCP_REGION-docker.pkg.dev. For example: us-central1-docker.pkg.dev/your-project/crewai.
Valid repository URIs (set in CREW_IMAGE_REGISTRY_OVERRIDE):
us-central1-docker.pkg.dev/your-project/crewai
us-docker.pkg.dev/your-project/crewai
europe-west1-docker.pkg.dev/your-project/crewai
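Putting the pieces together: the sketch below assembles the registry URI from the exports in this guide and shows where the platform-appended /crewai-enterprise suffix lands. The final image name and tag segments are illustrative, not documented values.

```shell
# Values assumed from the exports earlier in this guide
GCP_REGION="us-central1"
GCP_PROJECT_ID="your-project"
AR_REPO="crewai"

REGISTRY="${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/${AR_REPO}"
echo "$REGISTRY"                      # value for CREW_IMAGE_REGISTRY_OVERRIDE
echo "${REGISTRY}/crewai-enterprise"  # path after the platform appends its suffix
```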
Helm Configuration for Artifact Registry
envVars:
CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
# Note: /crewai-enterprise suffix is added automatically by CrewAI Platform
The Helm chart auto-detects the GAR host from CREW_IMAGE_REGISTRY_OVERRIDE when it matches the *-docker.pkg.dev pattern and sets GCP_ARTIFACT_REGISTRY_HOST automatically. BuildKit pods then obtain short-lived access tokens from the GKE metadata server to authenticate pushes.
How Crew Pods Pull Images from GAR
When a crew is deployed, the platform automatically injects a fresh short-lived GAR access token into the crew namespace’s image pull secret. This ensures crew pods can immediately pull their built images from Artifact Registry without any manual credential management.
How it works:
- Build pods push images to GAR via Workload Identity (using the GSA’s roles/artifactregistry.writer)
- At deploy time, the platform fetches a fresh GAR access token from the GKE metadata server and merges it into the registry pull secret
- Crew pods use this enriched pull secret to pull their images from GAR
This mechanism is fully automatic — it activates whenever the Helm chart auto-detects a GAR registry from CREW_IMAGE_REGISTRY_OVERRIDE.
Enable GKE Node Image Pulls from GAR (Defense-in-Depth)
As a fallback for scenarios where a crew pod is rescheduled to a new node after the injected token has expired, the node pool’s compute service account must have roles/artifactregistry.reader. This is configured in Step 2: Grant IAM Roles.
This IAM binding is required for production reliability. Without it, crew pods may fail to pull images if they are rescheduled to a new node more than one hour after initial deployment.
Verifying Artifact Registry Access
Test that the Workload Identity binding is working:
# From a pod in the namespace
kubectl run -it --rm gcloud-test \
--namespace=$GKE_NAMESPACE \
--image=google/cloud-sdk:slim \
--serviceaccount=$KSA_NAME \
--restart=Never -- bash
# Inside the pod:
gcloud auth print-access-token # Should succeed via Workload Identity
gcloud artifacts docker images list ${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/crewai
Bucket Deployment — Registry-Less Image Delivery
Bucket Deployment is an alternative to Artifact Registry that eliminates the need for any container registry. Instead of pushing built crew images to a registry, the platform stores them as compressed OCI tarballs in a GCS bucket and loads them directly into the container runtime on each Kubernetes node before deployment.
Bucket Deployment and Artifact Registry are mutually exclusive for crew image delivery. Choose one approach based on your requirements. Both still use BuildKit to build images — the difference is where the built image is stored and how it reaches the nodes.
When to Use Bucket Deployment
| Scenario | Recommended Approach |
|---|---|
| Standard GKE deployment | Artifact Registry |
| Customer policy prohibits container registries | Bucket Deployment |
| Air-gapped or restricted network (no registry access) | Bucket Deployment |
| Existing GCS infrastructure, want to avoid GAR setup | Bucket Deployment |
| Multi-region with global GAR replication | Artifact Registry |
How Bucket Deployment Works
The workflow has four phases:
1. Build — When a crew is deployed, BuildKit builds the container image as usual, but instead of pushing it to a registry, it outputs an OCI tarball. The tarball is compressed with gzip and uploaded to a GCS bucket using a Workload Identity access token.
2. Preload — A temporary Kubernetes DaemonSet is created, placing one pod on every node in the cluster. Each pod downloads the tarball from GCS, decompresses it, and imports it into the node’s containerd runtime using ctr images import. This makes the image available to the kubelet as if it had been pulled from a registry.
3. Deploy — Once all nodes have the image loaded, the crew is deployed via Helm with imagePullPolicy: Never and no imagePullSecrets. The kubelet finds the image in its local store and starts the pod normally.
4. Cleanup — The preloader DaemonSet is deleted. The imported images remain in the node’s containerd store.
GCS Bucket Kubernetes Cluster
┌─────────────────────┐ ┌────────────────────────────────────┐
│ crew-images/ │ │ │
│ <ref>.tar.gz ─────┼─wget──┼──→ Node 1: ctr import → ✓ cached │
│ │ │ │
│ <ref>.tar.gz ─────┼─wget──┼──→ Node 2: ctr import → ✓ cached │
│ │ │ │
│ <ref>.tar.gz ─────┼─wget──┼──→ Node 3: ctr import → ✓ cached │
│ │ │ │
└─────────────────────┘ │ All nodes ready → Helm deploy │
│ (imagePullPolicy: Never) │
│ → Preloader DaemonSet deleted │
└────────────────────────────────────┘
The preloader DaemonSet runs with privileged: true to access the node’s containerd socket and ctr binary. This elevated access is ephemeral — it only exists during the deployment process and is cleaned up immediately after the image is loaded. The crew pods themselves run with normal, unprivileged security contexts.
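The preload phase can be sketched as the work each preloader pod performs. This is an illustrative outline, not the platform's actual implementation: the bucket name, prefix, and image reference below are assumed values, and in practice the download is authenticated with a Workload Identity access token rather than a plain URL fetch.

```shell
# Illustrative sketch of a preloader pod's job (names are assumptions)
IMAGE_BUCKET="crewai-crew-images"
IMAGE_BUCKET_PREFIX="crew-images"
IMAGE_REF="example-crew-abc123"

OBJECT_URL="https://storage.googleapis.com/${IMAGE_BUCKET}/${IMAGE_BUCKET_PREFIX}/${IMAGE_REF}.tar.gz"
echo "$OBJECT_URL"

# On the node, with privileged access to the containerd socket:
# (an Authorization header with a Workload Identity token would be attached)
# wget -qO- "$OBJECT_URL" | gunzip | ctr -n k8s.io images import -
```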
Prerequisites
Before enabling Bucket Deployment, ensure you have:
- GCS bucket — A dedicated bucket for storing crew image tarballs
- IAM permissions — The Workload Identity GSA must have roles/storage.objectAdmin on the image bucket (for both upload and download)
- Workload Identity — Already configured per Step 3
- BuildKit — Enabled in the Helm chart (buildkit.enabled: true)
If you already configured roles/storage.objectAdmin at the project level for GCS object storage (see Step 2), the same binding covers the image bucket. No additional IAM configuration is needed — skip to Create the Image Bucket.
Create the Image Bucket
Create a dedicated GCS bucket for crew image tarballs. This bucket is separate from the general-purpose GCS_BUCKET used for crew artifacts and uploads.
export IMAGE_BUCKET="crewai-crew-images"
gcloud storage buckets create gs://$IMAGE_BUCKET \
--project=$GCP_PROJECT_ID \
--location=$GCP_REGION \
--uniform-bucket-level-access
If you scoped roles/storage.objectAdmin to the general-purpose bucket (rather than project-level), grant access to the image bucket separately:
gcloud storage buckets add-iam-policy-binding gs://$IMAGE_BUCKET \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"
Lifecycle policies are not recommended for the image bucket. Image tarballs are actively referenced by running crews and must remain available for node rescheduling or cluster scaling events. If you need cleanup, delete tarballs only after the corresponding crew has been undeployed.
Helm Configuration for Bucket Deployment
Set the following environment variables in your Helm values to enable Bucket Deployment:
envVars:
# Switch the crew deployment provider to bucket mode
PROVIDER: BUCKET_BUILDKIT_KUBERNETES
# GCS bucket for storing crew image tarballs (required)
IMAGE_BUCKET_NAME: "crewai-crew-images"
# Path prefix inside the bucket (optional, default: "crew-images")
IMAGE_BUCKET_PREFIX: "crew-images"
# Image reference name — still required even without a registry.
# Used as the image name when importing into containerd and
# referencing in Kubernetes pod specs.
CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
CREW_IMAGE_REGISTRY_OVERRIDE is still required even though no registry is used. This value serves as the image name/tag throughout the build, preload, and deploy pipeline. The image is tagged with this name when imported into containerd, and Kubernetes pods reference it in their image: field. Do not remove it.
Configuration Reference
| Environment Variable | Required | Default | Description |
|---|---|---|---|
| PROVIDER | Yes | BUILDKIT_KUBERNETES | Set to BUCKET_BUILDKIT_KUBERNETES to enable bucket mode |
| IMAGE_BUCKET_NAME | Yes | — | GCS bucket name for image tarballs |
| IMAGE_BUCKET_PREFIX | No | crew-images | Path prefix (folder) inside the bucket |
| CREW_IMAGE_REGISTRY_OVERRIDE | Yes | — | Image name reference (used as the containerd image tag) |
| CONTAINERD_SOCKET_PATH | No | /run/containerd/containerd.sock | Path to the containerd socket on nodes |
| CTR_HOST_PATH | No | /usr/bin | Host directory containing the ctr binary |
The CONTAINERD_SOCKET_PATH and CTR_HOST_PATH defaults are correct for standard GKE nodes (Container-Optimized OS and Ubuntu). Override them only if your cluster uses custom node images with non-standard containerd paths.
What Changes from Artifact Registry Mode
When switching from Artifact Registry to Bucket Deployment, the following Helm values change:
| Value | Artifact Registry | Bucket Deployment |
|---|---|---|
| envVars.PROVIDER | BUILDKIT_KUBERNETES (default) | BUCKET_BUILDKIT_KUBERNETES |
| envVars.IMAGE_BUCKET_NAME | Not set | Your GCS bucket name |
| envVars.IMAGE_BUCKET_PREFIX | Not set | crew-images (or custom) |
| envVars.CREW_IMAGE_REGISTRY_OVERRIDE | GAR path (images pushed here) | Same value (used as image name only) |
| Artifact Registry IAM roles | Required | Not needed |
| Node pool artifactregistry.reader | Required | Not needed |
| Crews namespace Workload Identity | Required (build pods push to GAR) | Required (build pods upload to GCS) |
Verifying Bucket Deployment
After deploying a crew, verify the bucket deployment pipeline:
1. Check the image tarball was uploaded:
gcloud storage ls gs://$IMAGE_BUCKET/crew-images/
# Should list .tar.gz files for deployed crews
2. Check preloader DaemonSet (during deployment):
# The preloader is ephemeral — only visible during active deployments
kubectl get daemonsets -n $GKE_NAMESPACE -l app=image-preloader
3. Verify the image is loaded on nodes:
# SSH into a node and check containerd
gcloud compute ssh NODE_NAME --zone=ZONE -- \
sudo ctr -n k8s.io images ls | grep crewai
4. Confirm crew pods are running with imagePullPolicy: Never:
kubectl get pods -n crewai-crews -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].imagePullPolicy}{"\n"}{end}'
# Should show "Never" for crew pods deployed via bucket mode
Secret Manager Integration (Optional)
GCP Secret Manager provides centralized secret management for CrewAI Platform.
Which Secrets to Store
Store in Secret Manager (sensitive, need rotation):
- DB_PASSWORD - Database credentials (if not using IAM auth)
- SECRET_KEY_BASE - Rails secret key
- GITHUB_TOKEN - For private repository access
- Auth provider secrets (ENTRA_ID_CLIENT_SECRET, OKTA_CLIENT_SECRET, etc.)
Keep in values.yaml (configuration, not secrets):
- DB_HOST, DB_PORT, DB_USER, POSTGRES_DB
- GCS_PROJECT_ID, GCS_BUCKET
- APPLICATION_HOST, AUTH_PROVIDER
External Secrets Operator Setup
CrewAI uses External Secrets Operator (ESO) to sync secrets from Secret Manager to Kubernetes.
Install ESO (if not already installed):
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets \
external-secrets/external-secrets \
--namespace external-secrets-operator \
--create-namespace
Helm Configuration for Secret Manager
secretStore:
enabled: true
provider: "gcp"
gcp:
projectID: "your-gcp-project"
auth:
workloadIdentity:
enabled: true
clusterLocation: "us-central1"
clusterName: "your-cluster-name"
serviceAccount: "crewai-sa"
externalSecret:
enabled: true
secretStoreKind: SecretStore
secretStore: "crewai-secret-store"
secretPath: "crewai-platform" # Secret name in Secret Manager
databaseSecretPath: "crewai-db-password" # Separate secret for DB password
External Access (Gateway API or Ingress)
GKE provides built-in support for the Kubernetes Gateway API, which is the recommended way to expose services externally.
The NGINX Ingress Controller was retired in March 2026. For new GKE deployments, Gateway API is recommended over traditional Ingress resources. Existing Ingress configurations continue to work.
Option 1: Gateway API (Recommended)
GKE ships with built-in GatewayClass resources — no additional controller installation is needed, but Gateway API support must be enabled on the cluster.
Enable Gateway API on GKE
gcloud container clusters update YOUR_CLUSTER_NAME \
--gateway-api=standard \
--region=$GCP_REGION \
--project=$GCP_PROJECT_ID
Verify the GatewayClass resources are available (this may take a minute to propagate):
kubectl get gatewayclass
You should see the following classes:
| GatewayClass | Type | Use Case |
|---|---|---|
| gke-l7-global-external-managed | Global external ALB | Multi-region, CDN, global anycast IP |
| gke-l7-regional-external-managed | Regional external ALB | Single-region, lower latency |
| gke-l7-rilb | Regional internal ALB | Internal-only access within VPC |
If kubectl get gatewayclass returns “No resources found”, the Gateway API CRDs are not installed. The Helm chart will fail with no matches for kind "Gateway". Run the gcloud container clusters update command above and wait for it to complete before deploying.
Helm Configuration
# Gateway API configuration
gateway:
enabled: true
create: true
gatewayClassName: gke-l7-global-external-managed
listeners:
- name: https
protocol: HTTPS
port: 443
tls:
mode: Terminate
certificateRefs:
- name: crewai-tls
- name: http
protocol: HTTP
port: 80
web:
gateway:
enabled: true
hostnames:
- "crewai.your-company.com"
# If OAuth is enabled (shared hostname with /oauthsvc prefix)
oauth:
enabled: true
gateway:
enabled: true
pathPrefix: "/oauthsvc"
If OAuth requires a dedicated hostname (e.g., because GKE does not support NGINX-style regex path rewriting), set pathPrefix: "/" and specify the hostname:
oauth:
enabled: true
gateway:
enabled: true
hostname: "oauth.your-company.com"
pathPrefix: "/"
When using a dedicated OAuth hostname, add it to the Gateway TLS certificate (or create a separate certificate map entry) and update DNS to point to the same Gateway IP.
To use a GCP-managed certificate instead of a Kubernetes TLS secret, add the annotation:
gateway:
enabled: true
create: true
gatewayClassName: gke-l7-global-external-managed
annotations:
networking.gke.io/certmap: "crewai-cert-map"
listeners:
- name: https
protocol: HTTPS
port: 443
To create the certificate map:
# Create a managed certificate
gcloud certificate-manager certificates create crewai-cert \
--domains="crewai.your-company.com"
# Create a certificate map and entry
gcloud certificate-manager maps create crewai-cert-map
gcloud certificate-manager maps entries create crewai-cert-entry \
--map=crewai-cert-map \
--certificates=crewai-cert \
--hostname="crewai.your-company.com"
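TLS will not work until the managed certificate finishes provisioning. A hedged check (requires an authenticated gcloud session; the fallback echo is only so the snippet runs anywhere):

```shell
# State should eventually become ACTIVE once the certificate map is
# attached to the Gateway and DNS resolves to the load balancer.
CERT_STATE=$(gcloud certificate-manager certificates describe crewai-cert \
  --format='value(managed.state)' 2>/dev/null || echo "UNKNOWN (gcloud not available)")
echo "certificate state: $CERT_STATE"
```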
If you already have a Gateway resource (e.g., shared across multiple apps), reference it instead of creating one:
gateway:
enabled: true
create: false
name: "shared-gateway"
namespace: "gateway-infra"
web:
gateway:
enabled: true
After deploying, get the load balancer IP:
kubectl get gateway -n $GKE_NAMESPACE
# Note the ADDRESS column — update your DNS record to point to it
The Helm chart automatically creates a GKE HealthCheckPolicy that configures the load balancer to use /health as the health check path. Without this, GKE’s health probes use the pod IP as the Host header, which Rails’ HostAuthorization middleware blocks — causing unconditional drop overload errors. This is handled automatically; no manual configuration is needed.
Option 2: GCE Ingress (Native GKE)
If you prefer traditional Ingress resources:
web:
ingress:
enabled: true
className: gce
host: "crewai.your-company.com"
annotations:
kubernetes.io/ingress.global-static-ip-name: "crewai-ip"
networking.gke.io/managed-certificates: "crewai-cert"
Option 3: NGINX Ingress Controller (Deprecated)
The NGINX Ingress Controller was retired in March 2026. Consider migrating to Gateway API for new deployments.
web:
ingress:
enabled: true
className: nginx
host: "crewai.your-company.com"
nginx:
tls:
enabled: true
secretName: "crewai-tls"
Complete GCP Deployment Example
Here is a complete production configuration for GCP:
# values-gcp-production.yaml
# ServiceAccount with Workload Identity annotation
serviceAccount:
annotations:
iam.gke.io/gcp-service-account: crewai-platform@your-project.iam.gserviceaccount.com
# Namespace for crew workloads (default: crewai-crews)
crewNamespace: "crewai-crews"
# Image pull credentials — required for pulling platform images from images.crewai.com.
# When installing via direct Helm (not KOTS), Replicated proxy credentials must be
# provided here. The credentials are the same used for `helm registry login`.
image:
registries:
- host: "images.crewai.com"
username: "your-email@company.com"
password: "your-replicated-license-token"
# Disable internal services (use GCP managed services)
postgres:
enabled: false
minio:
enabled: false
# Cloud SQL Auth Proxy
cloudSqlProxy:
enabled: true
instanceConnectionName: "your-project:us-central1:crewai-db"
port: 5432
privateIp: true
autoIamAuthn: false # Set true if using IAM database authentication
envVars:
# Database (via Cloud SQL Auth Proxy)
DB_HOST: "127.0.0.1"
DB_PORT: "5432"
DB_USER: "crewai"
POSTGRES_DB: "crewai_plus_production"
POSTGRES_CABLE_DB: "crewai_plus_cable_production"
POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"
# GCS for object storage
STORAGE_SERVICE: "google"
GCS_PROJECT_ID: "your-project"
GCS_BUCKET: "crewai-prod-storage"
GCS_IAM_SIGNING: "true"
# Artifact Registry for crew images
CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
# Application
APPLICATION_HOST: "crewai.your-company.com"
AUTH_PROVIDER: "entra_id"
RAILS_ENV: "production"
RAILS_LOG_LEVEL: "info"
# DB_PASSWORD is not needed when using Cloud SQL IAM auth (autoIamAuthn: true).
# For password-based auth, set it in secrets:
# secrets:
# DB_PASSWORD: "your-secure-password"
# Gateway API (recommended over NGINX Ingress)
gateway:
enabled: true
create: true
gatewayClassName: gke-l7-global-external-managed
annotations:
networking.gke.io/certmap: "crewai-cert-map"
listeners:
- name: https
protocol: HTTPS
port: 443
- name: http
protocol: HTTP
port: 80
# Web
web:
replicaCount: 3
resources:
requests:
cpu: "1000m"
memory: "6Gi"
limits:
cpu: "6"
memory: "12Gi"
gateway:
enabled: true
hostnames:
- "crewai.your-company.com"
# Worker
worker:
replicaCount: 3
resources:
requests:
cpu: "1000m"
memory: "6Gi"
limits:
cpu: "6"
memory: "12Gi"
# OAuth (if using built-in integrations)
oauth:
enabled: true
gateway:
enabled: true
pathPrefix: "/oauthsvc"
# BuildKit
buildkit:
enabled: true
replicaCount: 1
resources:
requests:
cpu: "500m"
memory: "2Gi"
limits:
cpu: "4"
memory: "8Gi"
# RBAC
rbac:
create: true
The image.registries section is required when installing via direct Helm (helm install ... oci://registry.crewai.com/...). It provides credentials for pulling platform images (busybox, redis, buildkit, etc.) from the Replicated proxy at images.crewai.com. Use the same email and license token you used for helm registry login registry.crewai.com. When installing via Replicated KOTS, these credentials are injected automatically and image.registries is not needed.
Deploy:
helm install crewai-platform \
oci://registry.crewai.com/crewai/stable/crewai-platform \
--values values-gcp-production.yaml \
--namespace crewai \
--create-namespace
Post-install: After the first install, annotate the crews namespace ServiceAccount for Workload Identity (see Step 3):
kubectl annotate serviceaccount default -n crewai-crews --overwrite \
iam.gke.io/gcp-service-account=crewai-platform@your-project.iam.gserviceaccount.com
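If you enable Cloud SQL IAM authentication instead of password auth, the relevant overrides to the example above are (a sketch; the IAM user name format comes from the IAM auth setup steps earlier in this guide):

```yaml
cloudSqlProxy:
  autoIamAuthn: true                              # proxy injects IAM tokens
envVars:
  DB_USER: "crewai-platform@your-project.iam"     # IAM format; DB_PASSWORD not needed
```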
Bucket Deployment Variant
To use Bucket Deployment instead of Artifact Registry, modify the envVars section in the example above:
envVars:
# Database (via Cloud SQL Auth Proxy)
DB_HOST: "127.0.0.1"
DB_PORT: "5432"
DB_USER: "crewai-platform@your-project.iam"
POSTGRES_DB: "crewai_plus_production"
POSTGRES_CABLE_DB: "crewai_plus_cable_production"
POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"
# GCS for object storage
STORAGE_SERVICE: "google"
GCS_PROJECT_ID: "your-project"
GCS_BUCKET: "crewai-prod-storage"
GCS_IAM_SIGNING: "true"
# Bucket Deployment (registry-less crew image delivery)
PROVIDER: BUCKET_BUILDKIT_KUBERNETES
IMAGE_BUCKET_NAME: "crewai-crew-images"
IMAGE_BUCKET_PREFIX: "crew-images"
# Image reference name (still required — used as containerd image tag)
CREW_IMAGE_REGISTRY_OVERRIDE: "us-central1-docker.pkg.dev/your-project/crewai"
# Application
APPLICATION_HOST: "crewai.your-company.com"
AUTH_PROVIDER: "entra_id"
RAILS_ENV: "production"
RAILS_LOG_LEVEL: "info"
When using Bucket Deployment, Artifact Registry IAM roles (roles/artifactregistry.writer, roles/artifactregistry.reader) are not needed. The roles/storage.objectAdmin role on the image bucket handles both upload (from build pods) and download (from preloader pods). The Crews namespace Workload Identity binding is still required — build pods use it to obtain GCS access tokens for uploading image tarballs.
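The image bucket and its IAM binding are prerequisites for this variant. A sketch of creating them, with assumed names matching the example above (the gcloud commands require an authenticated session):

```shell
# Assumed values; substitute your own.
GCP_PROJECT_ID="your-gcp-project"
GSA_NAME="crewai-platform"
IMAGE_BUCKET="crewai-crew-images"
MEMBER="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com"
echo "binding member: $MEMBER"
# gcloud storage buckets create "gs://${IMAGE_BUCKET}" \
#   --location=us-central1 --uniform-bucket-level-access
# gcloud storage buckets add-iam-policy-binding "gs://${IMAGE_BUCKET}" \
#   --member="$MEMBER" --role=roles/storage.objectAdmin
```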
Troubleshooting GCP-Specific Issues
Workload Identity Not Working
Symptoms: Pods get 403 Forbidden or could not retrieve default credentials errors.
Verify Workload Identity is enabled on the cluster:
gcloud container clusters describe YOUR_CLUSTER \
--region=$GCP_REGION \
--format='value(workloadIdentityConfig.workloadPool)'
# Should output: GCP_PROJECT_ID.svc.id.goog
Check ServiceAccount annotation:
kubectl get serviceaccount $KSA_NAME -n $GKE_NAMESPACE -o yaml | grep gcp-service-account
# Should show: iam.gke.io/gcp-service-account: GSA@PROJECT.iam.gserviceaccount.com
Test from a pod:
# Note: kubectl run's --serviceaccount flag was removed in kubectl 1.24+;
# use --overrides to set the ServiceAccount instead.
kubectl run -it --rm wi-test \
  --namespace=$GKE_NAMESPACE \
  --image=google/cloud-sdk:slim \
  --overrides='{"spec":{"serviceAccountName":"'"$KSA_NAME"'"}}' \
  --restart=Never -- \
  gcloud auth print-access-token
If this fails, verify the IAM binding:
gcloud iam service-accounts get-iam-policy \
${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com
# Should list the workloadIdentityUser binding for your KSA
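If the binding is missing, re-create it. The member string has a specific format (the project's workload pool plus the namespace/KSA pair); a sketch with placeholder names:

```shell
# Placeholders; substitute your project, namespace, and account names.
GCP_PROJECT_ID="your-gcp-project"
GKE_NAMESPACE="crewai"
KSA_NAME="crewai-sa"
GSA_NAME="crewai-platform"
MEMBER="serviceAccount:${GCP_PROJECT_ID}.svc.id.goog[${GKE_NAMESPACE}/${KSA_NAME}]"
echo "$MEMBER"
# Requires an authenticated gcloud session:
# gcloud iam service-accounts add-iam-policy-binding \
#   "${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
#   --role=roles/iam.workloadIdentityUser \
#   --member="$MEMBER"
```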
GCS Signed URL Errors
Symptoms: Logs show Google::Cloud::Storage::SignedUrlUnavailable: Service account credentials 'issuer (client_email)' is missing when deploying crews.
This happens when Workload Identity is used but IAM-based URL signing is not enabled. The google-cloud-storage gem cannot sign URLs without a private key; with Workload Identity, the IAM signBlob API must be used instead.
Fix:
- Ensure GCS_IAM_SIGNING: "true" is set in envVars
- Grant the roles/iam.serviceAccountTokenCreator role to the GSA:
gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
--member="serviceAccount:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountTokenCreator"
- Redeploy the Helm chart
GCS Access Denied
Symptoms: Logs show Google::Cloud::PermissionDeniedError for storage operations.
# Verify the GSA has storage permissions
gcloud projects get-iam-policy $GCP_PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--format="table(bindings.role)"
# Test from pod
kubectl exec -it deploy/crewai-web -n $GKE_NAMESPACE -- \
ruby -e "require 'google/cloud/storage'; puts Google::Cloud::Storage.new.buckets.map(&:name)"
Cloud SQL Connection Failures
Symptoms: Pods show connection refused on 127.0.0.1:5432.
Check the Cloud SQL Proxy sidecar is running:
kubectl get pods -n $GKE_NAMESPACE -l app.kubernetes.io/component=web -o jsonpath='{.items[0].status.containerStatuses[*].name}'
# Should include: cloud-sql-proxy
kubectl logs deploy/crewai-web -n $GKE_NAMESPACE -c cloud-sql-proxy
Verify the instance connection name:
gcloud sql instances describe $SQL_INSTANCE --format='value(connectionName)'
# Should match cloudSqlProxy.instanceConnectionName in your values
Cloud SQL IAM Authentication Failures
Symptoms: fe_sendauth: no password supplied or issue connecting with your username/password.
“no password supplied” — autoIamAuthn is not enabled in your Helm values:
cloudSqlProxy:
autoIamAuthn: true # Must be true for IAM auth
“issue connecting with your username/password” — the IAM token is being injected but rejected. Check these in order:
- IAM authentication flag on the instance (most common miss — off by default):
gcloud sql instances describe $SQL_INSTANCE \
--format='value(settings.databaseFlags)'
# Must include: cloudsql.iam_authentication=on
# If missing, enable it:
gcloud sql instances patch $SQL_INSTANCE \
--database-flags=cloudsql.iam_authentication=on
- IAM database user exists:
gcloud sql users list --instance=$SQL_INSTANCE --format="table(name,type)"
# Must show: GSA_NAME@GCP_PROJECT_ID.iam CLOUD_IAM_SERVICE_ACCOUNT
# If missing:
gcloud sql users create ${GSA_NAME}@${GCP_PROJECT_ID}.iam \
--instance=$SQL_INSTANCE \
--type=CLOUD_IAM_SERVICE_ACCOUNT
- GSA has required IAM roles:
gcloud projects get-iam-policy $GCP_PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:${GSA_NAME}@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
--format="table(bindings.role)"
# Must include: roles/cloudsql.client AND roles/cloudsql.instanceUser
- DB_USER is in IAM format (in Helm values):
envVars:
DB_USER: "crewai-platform@your-project.iam" # NOT "crewai"
- SQL privileges granted — the IAM user must have been granted access to the databases via GRANT ALL PRIVILEGES (see “Create Databases and User” section above)
Cloud SQL IAM: Table Ownership Errors During Upgrades
Symptoms: Rails migrations fail with PG::InsufficientPrivilege: ERROR: must be owner of table <table_name>.
Root cause: PostgreSQL DDL operations (ALTER TABLE, DROP COLUMN, ADD INDEX, etc.) require the executing user to be the owner of the table. When using IAM authentication, the IAM database user typically does not own tables that were created by the postgres superuser or a previous password-based user. GRANT ALL PRIVILEGES grants read/write access but does not transfer ownership.
Fix: Connect as the postgres superuser and transfer ownership of all tables, sequences, and views to the IAM user. See Step 5 of the IAM authentication setup for the exact SQL commands.
Prevention: Run the ownership transfer SQL during initial IAM auth setup (Step 5), before the first Helm upgrade.
Artifact Registry Push Failures
Symptoms: Crew deployments fail with unauthorized or denied during image push.
Verify the GSA has Artifact Registry write access:
gcloud artifacts repositories get-iam-policy $AR_REPO \
--location=$GCP_REGION \
--format="table(bindings.role, bindings.members)"
Check BuildKit pod logs:
# Find the most recent buildkit build pod
kubectl get pods -n $GKE_NAMESPACE -l app=buildkit-build --sort-by=.metadata.creationTimestamp
# Check logs for GAR auth
kubectl logs POD_NAME -n $GKE_NAMESPACE -c buildkit-client | grep "GAR authentication"
Check Workload Identity annotation on the crews namespace default SA (build pods use this SA):
kubectl get serviceaccount default -n crewai-crews -o jsonpath='{.metadata.annotations}'
# Should include: iam.gke.io/gcp-service-account: GSA@PROJECT.iam.gserviceaccount.com
If missing, annotate it (see Step 3).
Artifact Registry Pull Failures (Crew Pods)
Symptoms: Crew pods in crewai-crews namespace show ImagePullBackOff or ErrImagePull with 403 Forbidden when pulling from *-docker.pkg.dev.
This can have multiple causes:
1. Wrong pull secret name: The crew pod references a pull secret that doesn’t exist or has empty credentials.
# Check which secret the crew pod references
kubectl get pod POD_NAME -n crewai-crews -o jsonpath='{.spec.imagePullSecrets}'
# Verify the secret exists and has data
kubectl get secret SECRET_NAME -n crewai-crews -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
If the decoded config shows {"auths":{}} (empty), the Helm chart’s image.registries may not be configured. See the Complete Example.
2. Node pool compute SA lacks GAR read access:
# Verify node pool OAuth scopes (should include cloud-platform)
gcloud container node-pools describe default-pool \
--cluster YOUR_CLUSTER \
--region $GCP_REGION \
--format='value(config.oauthScopes)'
# Check if the compute SA has artifactregistry.reader
GCE_SA="$(gcloud projects describe $GCP_PROJECT_ID --format='value(projectNumber)')-compute@developer.gserviceaccount.com"
gcloud projects get-iam-policy $GCP_PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:${GCE_SA} AND bindings.role:roles/artifactregistry" \
--format="table(bindings.role)"
If the role is missing, grant it (see Step 2).
3. Verify the image exists in GAR:
gcloud artifacts docker images list \
${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/crewai/crewai-enterprise \
--include-tags
Build Pod Image Pull Failures
Symptoms: Build pods in the platform namespace show ErrImagePull for images from images.crewai.com.
This means the registry pull secret lacks Replicated proxy credentials. Verify:
# Check the pull secret contents
kubectl get secret $(kubectl get pod POD_NAME -n $GKE_NAMESPACE -o jsonpath='{.spec.imagePullSecrets[0].name}') \
-n $GKE_NAMESPACE -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | python3 -m json.tool
If images.crewai.com is not in the auths section, add Replicated proxy credentials to image.registries in your Helm values and redeploy (see the Complete Example).
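For reference, a correctly populated pull secret for images.crewai.com decodes to a structure like the following (hypothetical credentials; the auth field is base64 of username:password):

```shell
python3 - <<'EOF'
import base64, json

user = "your-email@company.com"            # placeholder credentials
token = "your-replicated-license-token"
auth = base64.b64encode(f"{user}:{token}".encode()).decode()
cfg = {"auths": {"images.crewai.com": {
    "username": user, "password": token, "auth": auth}}}
# Write a copy for inspection and print the expected shape.
with open("/tmp/expected-dockerconfig.json", "w") as f:
    json.dump(cfg, f, indent=2)
print(json.dumps(cfg, indent=2))
EOF
```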
Secret Manager Access Denied
Symptoms: ExternalSecret shows SecretSyncedError.
kubectl get externalsecret -n $GKE_NAMESPACE
kubectl describe externalsecret crewai-external-secret -n $GKE_NAMESPACE
kubectl get secretstore -n $GKE_NAMESPACE
kubectl describe secretstore crewai-secret-store -n $GKE_NAMESPACE
Common causes: the GSA is missing roles/secretmanager.secretAccessor, or the secretStore.gcp values (projectID, clusterName, clusterLocation, serviceAccount) do not match your cluster.
Bucket Deployment: Upload Failed
Symptoms: Crew deployment fails during the build phase with Bucket upload failed in the build pod logs.
Check the build pod logs:
kubectl get pods -n $GKE_NAMESPACE -l app=buildkit-build --sort-by=.metadata.creationTimestamp
kubectl logs POD_NAME -n $GKE_NAMESPACE -c buildkit-client | grep -A5 "bucket"
Common causes:
- Missing IMAGE_BUCKET_NAME — The environment variable is not set. Verify it appears in your Helm values under envVars.
- Bucket does not exist — Verify the bucket exists:
gcloud storage buckets describe gs://$IMAGE_BUCKET
- Missing IAM permissions — The Workload Identity GSA needs roles/storage.objectAdmin on the image bucket:
gcloud storage buckets get-iam-policy gs://$IMAGE_BUCKET \
--format="table(bindings.role, bindings.members)" \
| grep $GSA_NAME
- Metadata server unreachable — Build pods obtain access tokens from the GCE metadata server. If the logs show Failed to get access token from metadata server, verify that Workload Identity is bound for the crews namespace (see Step 3).
Bucket Deployment: Preloader Timeout
Symptoms: Crew deployment hangs at the preload phase. The preloader DaemonSet pods are not reaching Ready state.
Check the preloader DaemonSet and pod status:
kubectl get daemonsets -n $GKE_NAMESPACE -l app=image-preloader
kubectl get pods -n $GKE_NAMESPACE -l app=image-preloader -o wide
kubectl logs -n $GKE_NAMESPACE -l app=image-preloader --tail=50
Common causes:
- Image tarball not found in bucket — The preloader downloads from the bucket. If the build phase failed silently or the object name doesn’t match:
gcloud storage ls gs://$IMAGE_BUCKET/crew-images/
- GCS download permission denied — The preloader pods run in the platform namespace and use the platform KSA’s Workload Identity. Verify the GSA has read access to the image bucket.
- containerd socket not accessible — The preloader mounts the host’s containerd socket. If the node uses a non-standard socket path, set CONTAINERD_SOCKET_PATH in your Helm values. Check the default path:
# SSH into a node
gcloud compute ssh NODE_NAME --zone=ZONE -- ls -la /run/containerd/containerd.sock
- ctr binary not found — The preloader uses the host’s ctr binary from /usr/bin. If it’s in a different location on your nodes, set CTR_HOST_PATH in your Helm values:
gcloud compute ssh NODE_NAME --zone=ZONE -- which ctr
Bucket Deployment: ErrImageNeverPull
Symptoms: Crew pods show ErrImageNeverPull status after deployment.
This means Kubernetes is set to imagePullPolicy: Never but the image is not present in the node’s containerd store. This can happen if:
- Preloader didn’t complete on this node — The node may have been added to the cluster after the preloader DaemonSet ran. Redeploy the crew to trigger a fresh preload cycle.
- containerd image was garbage collected — containerd may have cleaned up unused images. Redeploy the crew.
- Image name mismatch — The CREW_IMAGE_REGISTRY_OVERRIDE value must be consistent across build, preload, and deploy. Verify:
# Check what image the pod is trying to use
kubectl get pod POD_NAME -n crewai-crews -o jsonpath='{.spec.containers[0].image}'
# Check what images are loaded on the node
gcloud compute ssh NODE_NAME --zone=ZONE -- \
sudo ctr -n k8s.io images ls | grep crewai
The image name from the pod spec must exactly match an image in the ctr output.