Overview
This guide walks through a complete, production-ready deployment of CrewAI Enterprise on Amazon EKS with the following stack:
- Database: Amazon Aurora for PostgreSQL 16 (external RDS)
- Storage: Amazon S3
- Container registry: Amazon ECR
- Load balancer: AWS Application Load Balancer (ALB) via AWS Load Balancer Controller
- IAM: EKS Pod Identity (no static credentials)
- Authentication: WorkOS (AuthKit)
- Trace collection: Wharf (OTLP)
- Studio V2: Post-install activation
The values.yaml in this guide is self-contained and deployable. No cross-references to other guides are required to complete this installation.
Prerequisites Checklist
Complete every item before running helm install. Missing any one will cause a failed or broken deployment.
EKS Cluster
Amazon RDS
Amazon S3
Amazon ECR
IAM — Pod Identity
ACM Certificate
WorkOS
Helm Registry Access
Infrastructure Setup
Aurora Instance Sizing
| Deployment Size | Instance Class | vCPU | RAM |
|---|---|---|---|
| Development | db.t3.medium | 2 | 4 GiB |
| Small Production | db.r6g.large | 2 | 16 GiB |
| Medium Production | db.r6g.xlarge | 4 | 32 GiB |
| Large Production | db.r6g.2xlarge | 8 | 64 GiB |
For RDS for PostgreSQL (non-Aurora) instances, use gp3 storage with a minimum of 3000 IOPS for production; Aurora manages its own storage layer. Memory-optimized instances (R6g family) are strongly preferred for CrewAI’s Rails workload.
Database Setup
Connect to your RDS instance as a superuser and run the following before helm install. The Helm chart does not create databases when postgres.enabled: false.
-- Create all four required databases
CREATE DATABASE crewai_plus_production;
CREATE DATABASE crewai_plus_cable_production;
CREATE DATABASE crewai_plus_oauth_production;
CREATE DATABASE wharf;
-- Grant the crewai user full access to each database
-- (the crewai role must already exist, otherwise these statements fail)
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_cable_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_oauth_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE wharf TO crewai;
The OAuth database name must be crewai_plus_oauth_production. The chart default for POSTGRES_OAUTH_DB is oauth_db. If you do not override it in envVars, the application will attempt to connect to a database named oauth_db, which does not exist, causing authentication failures.
The Wharf database name must be wharf. This matches the postgres.wharfDatabase chart default and requires no override.
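Before installing, it can help to confirm that all four databases exist. The following is a minimal sketch and not part of the chart: the helper name missing_databases is hypothetical, and it only compares a list of database names on stdin against the four names used in this guide.

```shell
# Hypothetical helper: reads database names (one per line) on stdin and
# prints each of the four required names that is absent from that list.
missing_databases() {
  existing=$(cat)
  for db in crewai_plus_production crewai_plus_cable_production \
            crewai_plus_oauth_production wharf; do
    # -x matches the whole line, -F treats the name as a literal string
    printf '%s\n' "$existing" | grep -qxF "$db" || printf '%s\n' "$db"
  done
}

# Example usage (assumes psql can reach the cluster and list databases):
#   psql -h <RDS_CLUSTER_ENDPOINT> -U crewai -d postgres -At \
#     -c 'SELECT datname FROM pg_database' | missing_databases
```

Empty output means every required database was found.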
S3 Bucket
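# In us-east-1, omit the --create-bucket-configuration line:
# S3 rejects LocationConstraint=us-east-1.
aws s3api create-bucket \
--bucket <YOUR_S3_BUCKET> \
--region <AWS_REGION> \
--create-bucket-configuration LocationConstraint=<AWS_REGION>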
aws s3api put-bucket-versioning \
--bucket <YOUR_S3_BUCKET> \
--versioning-configuration Status=Enabled
aws s3api put-bucket-encryption \
--bucket <YOUR_S3_BUCKET> \
--server-side-encryption-configuration '{
"Rules": [{
"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}
}]
}'
ECR Repository
# Repository name must end in /crewai-enterprise
aws ecr create-repository \
--repository-name <YOUR_ORG>/crewai-enterprise \
--region <AWS_REGION> \
--image-scanning-configuration scanOnPush=true
# Tags MUST be mutable — CrewAI overwrites tags across crew versions
aws ecr put-image-tag-mutability \
--repository-name <YOUR_ORG>/crewai-enterprise \
--image-tag-mutability MUTABLE \
--region <AWS_REGION>
Do NOT include /crewai-enterprise in CREW_IMAGE_REGISTRY_OVERRIDE. The platform appends this suffix automatically. Set CREW_IMAGE_REGISTRY_OVERRIDE to the prefix only — for example <account>.dkr.ecr.<region>.amazonaws.com/<org>. Including the suffix results in push failures to a double-suffixed path.
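The suffix rule above is easy to check mechanically before the value reaches values.yaml. This is a small sketch under that rule only; the function name check_registry_override is hypothetical and not provided by CrewAI.

```shell
# Hypothetical pre-flight check: succeeds only if the value does NOT end in
# /crewai-enterprise (a single trailing slash is ignored).
check_registry_override() {
  case "${1%/}" in
    */crewai-enterprise)
      echo "error: remove the /crewai-enterprise suffix from CREW_IMAGE_REGISTRY_OVERRIDE" >&2
      return 1 ;;
    *)
      return 0 ;;
  esac
}

# Example usage:
#   check_registry_override "<ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<YOUR_ORG>"
```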
IAM — Pod Identity Setup
Create a single IAM role that covers both S3 and ECR access. Pod Identity attaches this role to the crewai-sa service account without OIDC configuration.
1. Create the combined S3 + ECR policy
Save the following to crewai-policy.json, substituting your bucket name, region, and account ID:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3Access",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<YOUR_S3_BUCKET>",
        "arn:aws:s3:::<YOUR_S3_BUCKET>/*"
      ]
    },
    {
      "Sid": "ECRAuthToken",
      "Effect": "Allow",
      "Action": ["ecr:GetAuthorizationToken"],
      "Resource": "*"
    },
    {
      "Sid": "ECRPushPull",
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:PutImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload"
      ],
      "Resource": "arn:aws:ecr:<AWS_REGION>:<ACCOUNT_ID>:repository/*/crewai-enterprise"
    }
  ]
}
2. Create the IAM role and attach the policy
# Create role with Pod Identity trust policy
aws iam create-role \
--role-name CrewAIPodIdentityRole \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "pods.eks.amazonaws.com"},
"Action": ["sts:AssumeRole", "sts:TagSession"]
}]
}'
# Create and attach the policy
aws iam create-policy \
--policy-name CrewAIPlatformAccess \
--policy-document file://crewai-policy.json
aws iam attach-role-policy \
--role-name CrewAIPodIdentityRole \
--policy-arn arn:aws:iam::<ACCOUNT_ID>:policy/CrewAIPlatformAccess
3. Create the Pod Identity association
# Can be run before or after helm install. Pods only receive the role if the
# association exists when they start, so restart any pods created before it.
aws eks create-pod-identity-association \
--cluster-name <YOUR_CLUSTER> \
--namespace crewai \
--service-account crewai-sa \
--role-arn arn:aws:iam::<ACCOUNT_ID>:role/CrewAIPodIdentityRole
rbac.create: true in the Helm values (shown below) causes the chart to automatically create a ServiceAccount named crewai-sa in the deployment namespace. The Pod Identity association must reference this exact name.
ACM Certificate
aws acm request-certificate \
--domain-name <YOUR_DOMAIN> \
--validation-method DNS \
--region <AWS_REGION>
# Note the certificate ARN — you will need it in the values.yaml
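A certificate requested with DNS validation stays in PENDING_VALIDATION until the validation CNAME record exists in your DNS zone. The sketch below shows one way to read that record and wait for issuance with the AWS CLI; the two function names are hypothetical wrappers, and the certificate ARN and region are passed in as arguments.

```shell
# Print the CNAME record (name, type, value) that ACM expects for DNS validation.
# Assumes a single-domain certificate, hence index [0].
acm_validation_record() {
  aws acm describe-certificate \
    --certificate-arn "$1" \
    --region "$2" \
    --query 'Certificate.DomainValidationOptions[0].ResourceRecord' \
    --output json
}

# Block until ACM reports the certificate as validated and issued.
acm_wait_issued() {
  aws acm wait certificate-validated --certificate-arn "$1" --region "$2"
}

# Example usage:
#   acm_validation_record "<CERTIFICATE_ARN>" "<AWS_REGION>"
#   # create the CNAME in your DNS provider, then:
#   acm_wait_issued "<CERTIFICATE_ARN>" "<AWS_REGION>"
```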
WorkOS Setup
1. Create a WorkOS Application
In the WorkOS Dashboard:
- Create a new Application
- Note the Client ID (format: client_<...>)
- Note the AuthKit domain (format: https://<subdomain>.authkit.app)
- Navigate to Redirects and add: https://<YOUR_DOMAIN>/auth/workos/callback
- Generate an API key (format: sk_live_...)
2. Generate WORKOS_COOKIE_PASSWORD
The cookie password must be 32 characters or fewer. Longer values are silently truncated by the runtime, which can produce intermittent auth failures.
openssl rand -base64 32 | cut -c -32
Store both WORKOS_API_KEY and the generated cookie password. You will place them under envVars: in the values file.
Chart bug: WORKOS_API_KEY and WORKOS_COOKIE_PASSWORD must be placed under envVars:, not secrets:. If placed under secrets:, these values are silently absent from pods. The pods start normally with no indication of misconfiguration, and authentication fails with no clear error message. This is a known chart limitation. Both values must appear under envVars: as shown in the complete values.yaml below.
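Because an over-long cookie password is truncated silently, a length check before editing values.yaml can catch the problem early. This is a minimal sketch of the 32-character rule stated above; check_cookie_password is a hypothetical helper name.

```shell
# Hypothetical check: succeeds only for a non-empty value of at most 32 characters.
check_cookie_password() {
  len=${#1}
  if [ "$len" -eq 0 ] || [ "$len" -gt 32 ]; then
    echo "error: WORKOS_COOKIE_PASSWORD must be 1-32 characters (got $len)" >&2
    return 1
  fi
}

# Example usage:
#   check_cookie_password "$(openssl rand -base64 32 | cut -c -32)"
```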
Complete values.yaml
Replace every <PLACEHOLDER> with your environment-specific values before running helm install.
# ──────────────────────────────────────────────────────────────────────────────
# CrewAI Enterprise — AWS EKS + WorkOS + Wharf + Studio V2
# ──────────────────────────────────────────────────────────────────────────────
# Disable bundled PostgreSQL — using Amazon RDS Aurora
postgres:
  enabled: false
# Disable bundled MinIO — using Amazon S3
minio:
  enabled: false
# ── Database ─────────────────────────────────────────────────────────────────
envVars:
  # RDS connection
  DB_HOST: "<RDS_CLUSTER_ENDPOINT>" # e.g., crewai-prod.cluster-abc123.us-east-1.rds.amazonaws.com
  DB_PORT: "5432"
  DB_USER: "crewai"
  DB_POOL: "5"
  RAILS_MAX_THREADS: "5"
  # Database names: must match exactly what you created in SQL above
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production" # Chart default is "oauth_db"; MUST override
  # ── S3 Storage ──────────────────────────────────────────────────────────────
  STORAGE_SERVICE: "amazon"
  AWS_REGION: "<AWS_REGION>" # e.g., us-east-1
  AWS_BUCKET: "<YOUR_S3_BUCKET>"
  # ── ECR Crew Image Registry ──────────────────────────────────────────────────
  # Set to the registry prefix WITHOUT the /crewai-enterprise suffix.
  # The platform appends /crewai-enterprise automatically.
  # Example: if your repo is 123456789012.dkr.ecr.us-east-1.amazonaws.com/myorg/crewai-enterprise
  # then set: 123456789012.dkr.ecr.us-east-1.amazonaws.com/myorg
  CREW_IMAGE_REGISTRY_OVERRIDE: "<ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<YOUR_ORG>"
  # ── Application ────────────────────────────────────────────────────────────
  APPLICATION_HOST: "<YOUR_DOMAIN>" # e.g., crewai.company.com
  RAILS_LOG_LEVEL: "info"
  # ── WorkOS Authentication ───────────────────────────────────────────────────
  # CRITICAL: WORKOS_API_KEY and WORKOS_COOKIE_PASSWORD must be under envVars:
  # Placing them under secrets: causes them to be silently absent from pods,
  # resulting in authentication failures with no clear error.
  AUTH_PROVIDER: "workos"
  WORKOS_CLIENT_ID: "<WORKOS_CLIENT_ID>" # WorkOS Application Client ID
  WORKOS_API_KEY: "<WORKOS_API_KEY>" # sk_live_...; under envVars: NOT secrets:
  WORKOS_COOKIE_PASSWORD: "<WORKOS_COOKIE_PASSWORD>" # ≤32 chars; under envVars: NOT secrets:
# ── Secrets ───────────────────────────────────────────────────────────────────
secrets:
  DB_PASSWORD: "<RDS_DB_PASSWORD>"
  SECRET_KEY_BASE: "<SECRET_KEY_BASE>" # Generate: openssl rand -hex 64
# ── Wharf (OTLP trace collection) ─────────────────────────────────────────────
wharf:
  enabled: true # true is the chart default; explicit for clarity
# Wharf uses the same DB host/port/user/password as the main app.
# postgres.wharfDatabase defaults to "wharf" — matches the database we pre-created above.
# ── Web Application ───────────────────────────────────────────────────────────
web:
  replicaCount: 2
  # REQUIRED when using ALB: the chart default is true (SSL from Puma).
  # ALB terminates TLS and forwards plain HTTP to backend pods.
  # Leaving this true causes 502 Bad Gateway on every request.
  enableSslFromPuma: false
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"
  ingress:
    enabled: true
    className: "alb"
    host: "<YOUR_DOMAIN>"
    annotations:
      alb.ingress.kubernetes.io/ssl-redirect: "443"
      # idle_timeout.timeout_seconds is a load balancer attribute, not a target group attribute
      alb.ingress.kubernetes.io/load-balancer-attributes: "idle_timeout.timeout_seconds=300"
      alb.ingress.kubernetes.io/healthcheck-path: "/up"
    alb:
      scheme: "internet-facing" # lowercase only; capital letters silently fail
      targetType: "ip"
      certificateArn: "arn:aws:acm:<AWS_REGION>:<ACCOUNT_ID>:certificate/<CERTIFICATE_ID>"
      sslPolicy: "ELBSecurityPolicy-TLS13-1-2-2021-06"
# ── Background Workers ────────────────────────────────────────────────────────
worker:
  replicaCount: 2
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"
# ── BuildKit (crew image builds) ──────────────────────────────────────────────
buildkit:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: "500m"
      memory: "2Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
# ── RBAC ─────────────────────────────────────────────────────────────────────
# Creates ServiceAccount "crewai-sa" — must match the Pod Identity association
rbac:
  create: true
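Since every <PLACEHOLDER> in the values file must be replaced, a quick scan before installing can catch one that was missed. The following is a rough sketch: find_placeholders is a hypothetical helper, and it assumes placeholders follow the <UPPER_CASE> convention used in this guide.

```shell
# Hypothetical helper: prints (with line numbers) every line of the given file
# that still contains an <UPPER_CASE> placeholder.
# Returns 1 if any are found, 0 if the file is clean.
find_placeholders() {
  if grep -nE '<[A-Z][A-Z0-9_]*>' "$1"; then
    return 1
  fi
  return 0
}

# Example usage:
#   find_placeholders values.yaml && echo "no placeholders left"
```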
Install
Run the following command from the directory containing your values.yaml:
helm install crewai-platform \
oci://registry.crewai.com/crewai/stable/crewai-platform \
--values values.yaml \
--namespace crewai \
--create-namespace
Wait for all pods to reach Running state before proceeding to post-install:
kubectl get pods -n crewai --watch
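As a non-interactive alternative to watching the pod list, kubectl rollout status blocks until a deployment finishes rolling out. This sketch only covers the crewai-web deployment named elsewhere in this guide; the wrapper name wait_for_web and its default timeout are illustrative assumptions.

```shell
# Hypothetical wrapper: waits for the crewai-web deployment to finish rolling out.
# $1 = namespace (default: crewai), $2 = timeout (default: 10m)
wait_for_web() {
  kubectl rollout status deploy/crewai-web -n "${1:-crewai}" --timeout="${2:-10m}"
}

# Example usage:
#   wait_for_web && echo "web deployment is ready"
```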
Post-Install
Required Initialization
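Run these commands in the order shown before users begin working in the platform. Commands 1 and 2 can run immediately after install. With WorkOS, commands 3 and 4 require that the admin user has already logged in once, as explained in the note below the commands.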
# 1. Bootstrap the internal CrewAI organization
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails studio:install_internal_organization
# 2. Create default permission roles
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails factory:setup_permissions_defaults
# 3. Add your admin user as owner of org 2 (the first customer-facing org)
# Replace admin@company.com with your actual admin email
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails 'factory:add_owner[2,admin@company.com]'
# 4. Grant admin panel access (WorkOS — use factory:grant_admin, not App Roles)
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails 'factory:grant_admin[admin@company.com]'
Organization IDs are sequential integers. The internal CrewAI organization always receives ID 1. The first customer-facing organization created in the UI receives ID 2. To list all organizations:
kubectl exec -it deploy/crewai-web -n crewai -- \
bin/rails runner "puts Organization.all.map { |o| \"#{o.id}: #{o.name}\" }.join(\"\\n\")"
WorkOS users must log in before factory:add_owner and factory:grant_admin can reference their user record. With SSO providers, the user record is created on first login. Run studio:install_internal_organization and factory:setup_permissions_defaults first, then have the admin user log in via WorkOS, then run factory:add_owner and factory:grant_admin.
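The two-phase ordering described above can be sketched as a pair of shell functions: one to run before the admin's first login and one to run after it. The function names are hypothetical, and the sketch drops the -it flags on the assumption that these tasks do not need a terminal when run from a script.

```shell
# Run a single Rails task in the web deployment (namespace defaults to crewai).
rails_task() {
  kubectl exec deploy/crewai-web -n "${NS:-crewai}" -- bin/rails "$1"
}

# Phase 1: run immediately after install, before anyone logs in.
bootstrap_platform() {
  rails_task studio:install_internal_organization &&
  rails_task factory:setup_permissions_defaults
}

# Phase 2: run only after the admin has logged in once via WorkOS.
# $1 = admin email, $2 = organization ID (default: 2)
grant_first_admin() {
  email="$1"
  org_id="${2:-2}"
  rails_task "factory:add_owner[${org_id},${email}]" &&
  rails_task "factory:grant_admin[${email}]"
}

# Example usage:
#   bootstrap_platform
#   # ...admin logs in at https://<YOUR_DOMAIN>...
#   grant_first_admin admin@company.com
```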
Studio V2 Activation
Studio V2 cannot be configured in values.yaml. Adding studioV2.enabled: true or STUDIO_V2_ENABLED to your values file has no effect — Helm silently ignores unknown keys. Setup is always performed post-install through the following UI and kubectl steps.
Studio V2 activation has three ordered steps: two in the web UI, followed by a set of three kubectl commands. Complete both UI steps before running the kubectl commands.
Step 1 — Create the LLM Connection (UI)
- Log in as an admin user
- Navigate to Settings → LLM Connections
- Click New Connection
- Name the connection exactly studio-v2 (case-sensitive, with hyphen)
- Configure the LLM provider and model for the Studio agent
- Save the connection
Step 2 — Set as Default Connection (UI)
- Navigate to Settings → Crew Studio
- Set studio-v2 as the Default Connection
- Save
Step 3 — Run activation commands (kubectl)
Run these three commands in order. studio:agent:install will fail if the studio-v2 LLM Connection does not exist yet.
# Install the Studio agent (requires studio-v2 LLM Connection to exist)
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails studio:agent:install
# Sync and index built-in and enterprise tools
kubectl exec -it deploy/crewai-web -n crewai -- \
bin/rails studio:tools:sync_crewai_tools studio:tools:sync_enterprise_tools
# Install the Studio runner
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails studio:runner:install
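Because the order matters, it may be convenient to run the three commands as one sequence that stops at the first failure. This is a sketch only: activate_studio_v2 is a hypothetical name, and the -it flags are omitted on the assumption that the tasks do not prompt for input.

```shell
# Hypothetical wrapper: runs the three activation commands in order and
# stops at the first one that fails.
activate_studio_v2() {
  for task in studio:agent:install \
              "studio:tools:sync_crewai_tools studio:tools:sync_enterprise_tools" \
              studio:runner:install; do
    # $task is intentionally unquoted so the two sync tasks are passed
    # to bin/rails as separate arguments in a single invocation.
    kubectl exec deploy/crewai-web -n crewai -- bin/rails $task || {
      echo "Studio V2 activation stopped at: $task" >&2
      return 1
    }
  done
}
```

If the first task fails, check that the studio-v2 LLM Connection exists and is set as the default before retrying.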
Verify
Pods and Services
# All pods should be Running with no restarts
kubectl get pods -n crewai
# Ingress should show an ADDRESS (ALB DNS hostname)
kubectl get ingress -n crewai
ALB Provisioning
# If the ingress ADDRESS is empty after 3–5 minutes, check LBC logs
kubectl logs -n kube-system deployment/aws-load-balancer-controller --tail=50
Common causes of ALB not provisioning:
- scheme value is not lowercase: only "internet-facing" and "internal" are accepted; "Internet-Facing" silently fails
- Public subnets are missing the kubernetes.io/role/elb=1 tag
- AWS Load Balancer Controller is not installed or lacks IAM permissions
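The first two causes can be checked from the command line. The sketch below uses hypothetical helper names; list_elb_subnets lists every subnet in the region carrying the tag, so in an account with several VPCs you would likely add a vpc-id filter.

```shell
# Succeeds only for the two exact, lowercase scheme values.
valid_alb_scheme() {
  case "$1" in
    internet-facing|internal) return 0 ;;
    *) return 1 ;;
  esac
}

# Lists subnet IDs in the given region tagged kubernetes.io/role/elb=1.
# Empty output means no subnet is tagged for internet-facing load balancers.
list_elb_subnets() {
  aws ec2 describe-subnets \
    --region "$1" \
    --filters "Name=tag:kubernetes.io/role/elb,Values=1" \
    --query 'Subnets[].SubnetId' \
    --output text
}

# Example usage:
#   valid_alb_scheme "internet-facing" && list_elb_subnets <AWS_REGION>
```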
Database Connectivity
kubectl exec -it deploy/crewai-web -n crewai -- \
bin/rails runner "puts ActiveRecord::Base.connection.execute('SELECT 1').first"
WorkOS Authentication
- Navigate to https://<YOUR_DOMAIN> in a browser
- You should be redirected to the WorkOS AuthKit login page
- After login, you should land on the CrewAI dashboard
If authentication fails with no clear error, verify that WORKOS_API_KEY and WORKOS_COOKIE_PASSWORD are under envVars: and not secrets: in your values file.
# Confirm the env vars are present in the web pod
kubectl exec -it deploy/crewai-web -n crewai -- env | grep WORKOS
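The command above prints the secret values to the terminal. If you only need to know whether the variables are set, a filter that reports missing names avoids echoing them. This is a minimal sketch; missing_workos_vars is a hypothetical helper, and the three variable names are the ones used in this guide.

```shell
# Hypothetical helper: reads `env` output on stdin and prints the name of each
# WorkOS variable that is absent or empty. Values are never printed.
missing_workos_vars() {
  env_dump=$(cat)
  for name in WORKOS_CLIENT_ID WORKOS_API_KEY WORKOS_COOKIE_PASSWORD; do
    # Require at least one character after the equals sign
    printf '%s\n' "$env_dump" | grep -q "^${name}=..*" || printf '%s\n' "$name"
  done
}

# Example usage:
#   kubectl exec deploy/crewai-web -n crewai -- env | missing_workos_vars
```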
Pod Identity (S3 + ECR)
# Verify the pod can assume the IAM role via Pod Identity
kubectl exec -it deploy/crewai-web -n crewai -- aws sts get-caller-identity
# Verify S3 access
kubectl exec -it deploy/crewai-web -n crewai -- aws s3 ls s3://<YOUR_S3_BUCKET>/
# Verify Pod Identity agent is running
kubectl get daemonset eks-pod-identity-agent -n kube-system
# Verify the association exists
aws eks list-pod-identity-associations \
--cluster-name <YOUR_CLUSTER> \
--namespace crewai
# Read the Factory debug token from the platform secret,
# then query the debug health endpoint
TOKEN=$(kubectl get secret crewai-platform-secrets -n crewai \
-o jsonpath='{.data.FACTORY_DEBUG_TOKEN}' | base64 -d)
curl -H "X-Factory-Debug-Token: $TOKEN" \
https://<YOUR_DOMAIN>/health/debug
All components should report "status": "ok". For detailed diagnostics, see the Factory Health guide.
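To turn the health response into a pass/fail result, the status fields can be checked with standard text tools. The exact response schema is not documented here, so this sketch assumes only that each component carries a "status" string field; all_components_ok is a hypothetical helper name.

```shell
# Hypothetical helper: reads the health JSON on stdin and succeeds only if at
# least one "status" field is present and every one of them equals "ok".
all_components_ok() {
  statuses=$(grep -oE '"status"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed -E 's/.*"([^"]*)"$/\1/')
  [ -n "$statuses" ] || return 1
  # Fail if any extracted status line is something other than exactly "ok"
  ! printf '%s\n' "$statuses" | grep -qvx 'ok'
}

# Example usage:
#   curl -fsS -H "X-Factory-Debug-Token: $TOKEN" \
#     https://<YOUR_DOMAIN>/health/debug | all_components_ok && echo healthy
```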