Overview
This guide covers a full production deployment of CrewAI Platform on AWS EKS using:
- Amazon RDS (PostgreSQL 16) for all databases including Wharf
- Amazon S3 for object storage
- Amazon ECR for crew container images
- AWS ALB for ingress with ACM TLS termination
- Microsoft Entra ID for SSO authentication
- Wharf for OTLP trace and span collection
- Studio V2 for the AI-powered crew builder (post-install)
This is a self-contained guide: all required configuration is included here, so you do not need to cross-reference other guides for values.
Prerequisites Checklist
Complete every item before running helm install.
AWS Infrastructure
| Component | Requirement |
|---|---|
| EKS cluster | Kubernetes 1.32.0+, AMD64 worker nodes |
| AWS Load Balancer Controller | Installed and running in kube-system |
| Amazon RDS | PostgreSQL 16, accessible from EKS worker nodes |
| Amazon S3 bucket | Created with versioning and encryption enabled |
| Amazon ECR repository | Ends in /crewai-enterprise, tag mutability MUTABLE |
| ACM certificate | Issued and validated for your domain |
| IAM role | Pod Identity association for crewai-sa with S3 + ECR permissions |
| Public/private subnets | Tagged for ALB discovery (public subnets: kubernetes.io/role/elb=1; private subnets: kubernetes.io/role/internal-elb=1) |
CrewAI Platform only supports AMD64 (x86_64) worker nodes. ARM64 (Graviton) worker nodes are not supported.
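One way to check the AMD64 requirement before installing is to list each node's reported architecture and flag anything that is not amd64. The sketch below is illustrative only: the function name find_unsupported_nodes is made up for this example, and it assumes input lines of the form "node-name arch".

```shell
# Reads "node-name arch" pairs on stdin and prints every node whose
# architecture is not amd64. Returns non-zero if any such node is found.
find_unsupported_nodes() {
  awk 'NF >= 2 && $2 != "amd64" { print $1 " (" $2 ")"; bad = 1 } END { exit bad }'
}

# Example usage against a live cluster (not executed here):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.architecture}{"\n"}{end}' \
#     | find_unsupported_nodes
```

An empty result with exit status 0 means every node reported amd64.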
Microsoft Entra ID
| Step | Status |
|---|---|
| App registration created in Azure portal | |
| Redirect URI configured: https://<YOUR_DOMAIN>/auth/entra_id/callback | |
| Application (client) ID collected | |
| Directory (tenant) ID collected | |
| Client secret created and value copied | |
| Admin consent granted for Microsoft Graph User.Read | |
| App Roles created: member and factory-admin | |
| Admin user assigned factory-admin App Role | |
The redirect URI must be configured in Azure before running helm install. Authentication will fail silently if it is missing or uses the wrong path.
- kubectl connected to your EKS cluster
- helm 3.10+
- AWS CLI with credentials for your account
Infrastructure Setup
RDS: Pre-Create All Four Databases
Connect to your RDS instance as the master user (postgres by default) and run:
-- Create the application user (if not already created)
CREATE USER crewai WITH PASSWORD '<DB_PASSWORD>';
-- Create all required databases
CREATE DATABASE crewai_plus_production;
CREATE DATABASE crewai_plus_cable_production;
CREATE DATABASE crewai_plus_oauth_production; -- Chart default is "oauth_db" — this override is required
CREATE DATABASE wharf; -- Used by Wharf for trace/span storage
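-- Grant full access to the crewai user
-- Note: on PostgreSQL 15 and later, database-level privileges do not include CREATE on the public schema.
-- If migrations fail with "permission denied for schema public", connect to each database and run:
--   GRANT ALL ON SCHEMA public TO crewai;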
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_cable_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_oauth_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE wharf TO crewai;
The chart default for POSTGRES_OAUTH_DB is oauth_db. You must override it to crewai_plus_oauth_production to match the database you created above. A mismatch causes the OAuth service to fail on startup.
When postgres.enabled: false, the Helm chart does not create databases automatically. All four databases must exist before helm install runs.
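Because the chart will not create these databases for you, it can help to confirm that all four exist before installing. The following is a minimal sketch, not part of the chart: the function name missing_databases is hypothetical, and it expects one database name per line on stdin.

```shell
# Prints each required database that is absent from the list on stdin
# (one database name per line). Returns non-zero if anything is missing.
missing_databases() {
  existing=$(cat)
  status=0
  for db in crewai_plus_production crewai_plus_cable_production \
            crewai_plus_oauth_production wharf; do
    if ! printf '%s\n' "$existing" | grep -qxF "$db"; then
      echo "missing: $db"
      status=1
    fi
  done
  return $status
}

# Example usage (not executed here; assumes psql can reach the RDS endpoint):
#   psql -h <YOUR_RDS_ENDPOINT> -U crewai -d postgres -Atc \
#     "SELECT datname FROM pg_database" | missing_databases
```

A database created under the chart default name oauth_db would be reported as missing here, which is the mismatch described above.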
S3: Create Bucket
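# In any region other than us-east-1, also pass:
#   --create-bucket-configuration LocationConstraint=<AWS_REGION>
aws s3api create-bucket \
--bucket <YOUR_S3_BUCKET> \
--region <AWS_REGION>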
aws s3api put-bucket-versioning \
--bucket <YOUR_S3_BUCKET> \
--versioning-configuration Status=Enabled
aws s3api put-bucket-encryption \
--bucket <YOUR_S3_BUCKET> \
--server-side-encryption-configuration '{
"Rules": [{
"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}
}]
}'
ECR: Create Repository
# Repository name MUST end in /crewai-enterprise
aws ecr create-repository \
--repository-name <YOUR_ORG>/crewai-enterprise \
--region <AWS_REGION> \
--image-scanning-configuration scanOnPush=true
# Tags MUST be mutable — CrewAI overwrites tags on each build
aws ecr put-image-tag-mutability \
--repository-name <YOUR_ORG>/crewai-enterprise \
--image-tag-mutability MUTABLE \
--region <AWS_REGION>
Do not set ECR image tag mutability to IMMUTABLE. CrewAI rewrites tags for crew versions — immutable tags cause build failures.
IAM: Pod Identity for S3 and ECR
Create a combined IAM policy for S3 and ECR access, then attach it to a role configured for Pod Identity.
Combined IAM policy (crewai-platform-policy.json):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "S3Access",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::<YOUR_S3_BUCKET>",
"arn:aws:s3:::<YOUR_S3_BUCKET>/*"
]
},
{
"Sid": "ECRAuthToken",
"Effect": "Allow",
"Action": ["ecr:GetAuthorizationToken"],
"Resource": "*"
},
{
"Sid": "ECRPushPull",
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage",
"ecr:PutImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload"
],
"Resource": "arn:aws:ecr:<AWS_REGION>:<ACCOUNT_ID>:repository/*"
}
]
}
# Create policy
aws iam create-policy \
--policy-name CrewAIPlatformAccess \
--policy-document file://crewai-platform-policy.json
# Create IAM role for Pod Identity
aws iam create-role \
--role-name CrewAIPodIdentityRole \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "pods.eks.amazonaws.com"},
"Action": ["sts:AssumeRole", "sts:TagSession"]
}]
}'
# Attach policy to role
aws iam attach-role-policy \
--role-name CrewAIPodIdentityRole \
--policy-arn arn:aws:iam::<ACCOUNT_ID>:policy/CrewAIPlatformAccess
# Create Pod Identity association — service account name is crewai-sa (created by rbac.create: true)
aws eks create-pod-identity-association \
--cluster-name <YOUR_CLUSTER> \
--namespace crewai \
--service-account crewai-sa \
--role-arn arn:aws:iam::<ACCOUNT_ID>:role/CrewAIPodIdentityRole
With rbac.create: true (the chart default), the Helm chart creates a ServiceAccount named crewai-sa. The Pod Identity association must reference this exact name and the namespace you deploy into.
ACM: Request Certificate
aws acm request-certificate \
--domain-name <YOUR_DOMAIN> \
--validation-method DNS \
--region <AWS_REGION>
# Save the certificate ARN — needed in values.yaml
Entra ID: Azure Portal Setup
Step 1: App Registration
- Go to portal.azure.com > Microsoft Entra ID > App registrations > New registration
- Name: CrewAI (or your preferred name)
- Supported account types: Accounts in this organizational directory only
- Redirect URI: Web platform, https://<YOUR_DOMAIN>/auth/entra_id/callback
- Click Register
The redirect URI path must be exactly /auth/entra_id/callback (lowercase, underscore). Configure this before running helm install — a missing or incorrect URI causes authentication to fail with no clear error.
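Since the path has to match exactly, a small comparison helper can catch casing or punctuation slips before you paste the value into Azure. This is a sketch only; both function names are invented for illustration.

```shell
# Builds the redirect URI this guide requires for a given domain.
expected_redirect_uri() {
  printf 'https://%s/auth/entra_id/callback\n' "$1"
}

# Succeeds only if the configured URI ($2) matches the expected URI for the
# domain ($1) exactly. The comparison is case-sensitive.
check_redirect_uri() {
  [ "$2" = "$(expected_redirect_uri "$1")" ]
}

# Example usage:
#   check_redirect_uri crewai.company.com "https://crewai.company.com/auth/entra_id/callback"
```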
Step 2: Collect Credentials
From the app overview page, copy:
- Application (client) ID → ENTRA_ID_CLIENT_ID
- Directory (tenant) ID → ENTRA_ID_TENANT_ID
Step 3: Create Client Secret
- Left sidebar > Manage > Certificates & secrets
- New client secret — enter a description, choose expiration
- Copy the Value immediately (it is not shown again) → ENTRA_ID_CLIENT_SECRET
Step 4: Grant Admin Consent
- Enterprise applications > select your app
- Security > Permissions > Grant admin consent
- Confirm consent for Microsoft Graph User.Read
Step 5: Create App Roles
- Back in App registrations > your app > Manage > App roles
- Create two roles:
| Display Name | Value | Allowed Member Types |
|---|---|---|
| Member | member | Users/Groups |
| Factory Admin | factory-admin | Users/Groups |
Ensure “Do you want to enable this app role?” is checked for each.
Step 6: Assign Users
- Enterprise applications > your app > Manage > Properties
- Set Assignment required? to Yes, then Save
- Manage > Users and groups > Add user/group
- Regular users: assign Member role
- Admin users: assign Factory Admin role
For Entra ID deployments, admin access is granted exclusively through the Factory Admin App Role in Azure portal. Do NOT run factory:grant_admin — it writes to a database table that is not consulted for Entra ID users. The platform reads admin status from the JWT roles claim.
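Because admin status comes from the roles claim, it can be useful to look inside a token when a user unexpectedly lacks admin access. The sketch below decodes a JWT payload without verifying its signature, so it is for inspection only. It assumes you have already captured a token yourself (how to obtain one is outside this guide), and the function names are illustrative.

```shell
# Prints the decoded JSON payload (second dot-separated segment) of the JWT
# passed as $1. The signature is NOT verified.
jwt_payload() {
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops padding; restore it so base64 -d accepts the input.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Succeeds if the decoded payload contains the string "factory-admin".
# This is a crude text match, not a full JSON parse of the roles claim.
has_factory_admin_role() {
  jwt_payload "$1" | grep -q '"factory-admin"'
}

# Example usage:
#   has_factory_admin_role "$TOKEN" && echo "roles claim includes factory-admin"
```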
Complete values.yaml
Replace all <PLACEHOLDER> values before running helm install.
# ──────────────────────────────────────────────
# Disable bundled PostgreSQL — using Amazon RDS
# ──────────────────────────────────────────────
postgres:
enabled: false
# ──────────────────────────────────────────────
# Disable bundled MinIO — using Amazon S3
# ──────────────────────────────────────────────
minio:
enabled: false
# ──────────────────────────────────────────────
# Database: Amazon RDS
# ──────────────────────────────────────────────
envVars:
DB_HOST: "<YOUR_RDS_ENDPOINT>" # e.g. crewai.cluster-abc123.us-east-1.rds.amazonaws.com
DB_PORT: "5432"
DB_USER: "crewai"
POSTGRES_DB: "crewai_plus_production"
POSTGRES_CABLE_DB: "crewai_plus_cable_production"
POSTGRES_OAUTH_DB: "crewai_plus_oauth_production" # Chart default is "oauth_db" — must match what you created in RDS
# ──────────────────────────────────────────
# Object storage: Amazon S3
# Using Pod Identity — no static credentials needed
# ──────────────────────────────────────────
STORAGE_SERVICE: "amazon"
AWS_REGION: "<AWS_REGION>" # e.g. us-east-1
AWS_BUCKET: "<YOUR_S3_BUCKET>"
# ──────────────────────────────────────────
# Crew image registry: Amazon ECR
# Value is the registry prefix WITHOUT /crewai-enterprise
# The platform appends /crewai-enterprise automatically
# ──────────────────────────────────────────
CREW_IMAGE_REGISTRY_OVERRIDE: "<ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<YOUR_ORG>"
# Example: 123456789012.dkr.ecr.us-east-1.amazonaws.com/production
# Do NOT include /crewai-enterprise here — the platform appends it
# ──────────────────────────────────────────
# Authentication: Microsoft Entra ID
# CLIENT_ID and TENANT_ID are non-sensitive identifiers — they belong under envVars
# CLIENT_SECRET is a credential — it belongs under secrets (below)
# ──────────────────────────────────────────
AUTH_PROVIDER: "entra_id" # exact value — lowercase with underscore
ENTRA_ID_CLIENT_ID: "<APPLICATION_CLIENT_ID>" # from App Registration overview
ENTRA_ID_TENANT_ID: "<DIRECTORY_TENANT_ID>" # from App Registration overview
# ──────────────────────────────────────────
# Application
# ──────────────────────────────────────────
APPLICATION_HOST: "<YOUR_DOMAIN>" # e.g. crewai.company.com — no https:// prefix
RAILS_LOG_LEVEL: "info"
RAILS_MAX_THREADS: "5"
DB_POOL: "5"
# ──────────────────────────────────────────────
# Secrets
# ──────────────────────────────────────────────
secrets:
DB_PASSWORD: "<YOUR_DB_PASSWORD>"
SECRET_KEY_BASE: "<128-char-hex-string>" # openssl rand -hex 64 (64 random bytes = 128 hex characters)
ENTRA_ID_CLIENT_SECRET: "<CLIENT_SECRET_VALUE>" # from Azure Certificates & Secrets
# ──────────────────────────────────────────────
# Wharf: OTLP trace and span collection
# wharf.enabled is the chart default (true) — included here for clarity
# Wharf shares the same DB host/port/user/password as the main app
# The wharf database must be pre-created in RDS (see SQL above)
# ──────────────────────────────────────────────
wharf:
enabled: true
postgres:
wharfDatabase: "wharf" # must match the database you created in RDS
# ──────────────────────────────────────────────
# Web: application pods
# ──────────────────────────────────────────────
web:
replicaCount: 2
enableSslFromPuma: false # REQUIRED when using ALB — chart default is true
# Omitting this causes 502 Bad Gateway on every request
resources:
requests:
cpu: "1000m"
memory: "6Gi"
limits:
cpu: "6"
memory: "12Gi"
ingress:
enabled: true
className: "alb"
host: "<YOUR_DOMAIN>" # must match APPLICATION_HOST and ACM certificate
annotations:
alb.ingress.kubernetes.io/ssl-redirect: "443"
alb.ingress.kubernetes.io/load-balancer-attributes: "idle_timeout.timeout_seconds=300"
alb.ingress.kubernetes.io/healthcheck-path: "/up"
alb:
scheme: "internet-facing" # lowercase only — capital letters silently fail to provision
targetType: "ip"
certificateArn: "arn:aws:acm:<AWS_REGION>:<ACCOUNT_ID>:certificate/<CERT_ID>"
sslPolicy: "ELBSecurityPolicy-TLS13-1-2-2021-06"
# ──────────────────────────────────────────────
# Worker: background job processing
# ──────────────────────────────────────────────
worker:
replicaCount: 2
resources:
requests:
cpu: "1000m"
memory: "6Gi"
limits:
cpu: "6"
memory: "12Gi"
# ──────────────────────────────────────────────
# BuildKit: crew container image builds
# ──────────────────────────────────────────────
buildkit:
enabled: true
replicaCount: 1
resources:
requests:
cpu: "500m"
memory: "2Gi"
limits:
cpu: "4"
memory: "8Gi"
# ──────────────────────────────────────────────
# RBAC: auto-creates ServiceAccount crewai-sa
# Required for Pod Identity — name must match the association created in AWS
# ──────────────────────────────────────────────
rbac:
create: true
web.enableSslFromPuma: false is required. The chart default is true. ALB terminates TLS and forwards plain HTTP to backend pods. With the default true, Puma expects HTTPS connections but receives HTTP from the ALB, causing 502 Bad Gateway errors and failed health checks on every request.
alb.scheme is case-sensitive. The AWS Load Balancer Controller only accepts lowercase values. Using "Internet-Facing" or "Internal" (capitalized) causes the ALB to silently fail to provision — the ingress will have no ADDRESS indefinitely.
CREW_IMAGE_REGISTRY_OVERRIDE must not include /crewai-enterprise. The platform appends this suffix automatically. If your ECR repository is 123456789012.dkr.ecr.us-east-1.amazonaws.com/production/crewai-enterprise, set only 123456789012.dkr.ecr.us-east-1.amazonaws.com/production. Including the suffix causes push failures to a double-suffixed path.
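Two of the pitfalls above, the registry suffix and the scheme casing, are easy to check mechanically before installing. The helpers below are a minimal sketch with made-up function names; they only validate strings and do not read values.yaml.

```shell
# Derives CREW_IMAGE_REGISTRY_OVERRIDE from a full ECR repository URI by
# stripping the trailing /crewai-enterprise. Fails if the suffix is absent.
registry_override_from_repo() {
  case "$1" in
    */crewai-enterprise) printf '%s\n' "${1%/crewai-enterprise}" ;;
    *) echo "repository must end in /crewai-enterprise: $1" >&2; return 1 ;;
  esac
}

# Accepts only the lowercase scheme values: internet-facing or internal.
valid_alb_scheme() {
  case "$1" in
    internet-facing|internal) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage:
#   registry_override_from_repo \
#     123456789012.dkr.ecr.us-east-1.amazonaws.com/production/crewai-enterprise
```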
Install
helm install crewai-platform \
oci://registry.crewai.com/crewai/stable/crewai-platform \
--values values.yaml \
--namespace crewai \
--create-namespace
Wait for all pods to reach Running:
kubectl get pods -n crewai -w
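As an alternative to watching the pod list by hand, kubectl rollout status blocks until a deployment has finished rolling out. The wrapper below is a sketch: the function name and the 600s default timeout are assumptions, and crewai-web is the deployment name used elsewhere in this guide.

```shell
# Blocks until the named deployment finishes rolling out or the timeout hits.
#   $1 = deployment name, $2 = namespace (default crewai), $3 = timeout (default 600s)
wait_for_rollout() {
  kubectl rollout status "deployment/$1" -n "${2:-crewai}" --timeout="${3:-600s}"
}

# Example usage:
#   wait_for_rollout crewai-web
```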
Post-Install
Required Initialization
These commands must be completed before any user can log in. Run them in the order shown.
# 1. Initialize the internal CrewAI organization
kubectl exec -it -n crewai deploy/crewai-web -- bin/rails studio:install_internal_organization
# 2. Set up default permission roles
kubectl exec -it -n crewai deploy/crewai-web -- bin/rails factory:setup_permissions_defaults
# 3. Add your admin user as owner of org 2 (the first customer-facing org)
# Replace admin@company.com with the email address assigned the factory-admin App Role in Azure
kubectl exec -it -n crewai deploy/crewai-web -- bin/rails 'factory:add_owner[2,admin@company.com]'
Do NOT run factory:grant_admin for Entra ID deployments. Admin panel access is controlled by the factory-admin App Role in Azure portal. The factory:grant_admin command writes to a database table that Entra ID authentication does not consult — it has no effect on Entra ID users and will not grant admin access.
For Entra ID, the user record is created in the database automatically on first login. The factory:add_owner command above can be run before or after the user’s first login.
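Since the initialization tasks must run in order, a small wrapper that stops at the first failure can prevent a later task from running against a half-initialized system. This is a sketch under the guide's assumptions (namespace crewai, deployment crewai-web); the function name run_rails_tasks is hypothetical.

```shell
# Runs each Rails task inside the web deployment, in the order given,
# and stops at the first task that fails.
run_rails_tasks() {
  for task in "$@"; do
    echo "==> bin/rails $task"
    kubectl exec -n crewai deploy/crewai-web -- bin/rails "$task" || {
      echo "task failed: $task" >&2
      return 1
    }
  done
}

# Example usage:
#   run_rails_tasks studio:install_internal_organization \
#     factory:setup_permissions_defaults \
#     'factory:add_owner[2,admin@company.com]'
```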
Studio V2 Setup
Studio V2 cannot be configured in values.yaml. Adding studioV2.enabled or STUDIO_V2_ENABLED has no effect — Helm silently ignores unknown keys. Setup requires the platform to be fully running and accessible.
Step 1: Create the LLM Connection (UI)
- Log in to the CrewAI web UI as an admin
- Navigate to Settings → LLM Connections
- Click New Connection
- Set the name to exactly studio-v2 (lowercase, no spaces)
- Select your LLM provider, enter the model name and API key
- Click Save
The connection name must be exactly studio-v2. The install commands in Step 3 look up this name specifically — a different name or capitalization causes them to fail silently.
Step 2: Set as Default Connection (UI)
- Navigate to Settings → Crew Studio
- Under Default Connection, select studio-v2
- Click Save
Step 3: Run Install Commands (kubectl)
Run these commands in order. Each must complete successfully before running the next.
# 1. Install the Studio agent
# This command FAILS if the studio-v2 LLM Connection does not exist yet
kubectl exec -it -n crewai deploy/crewai-web -- bin/rails studio:agent:install
# 2. Sync and index tools
kubectl exec -it -n crewai deploy/crewai-web -- \
bin/rails studio:tools:sync_crewai_tools \
studio:tools:sync_enterprise_tools
# 3. Install the Studio runner
kubectl exec -it -n crewai deploy/crewai-web -- bin/rails studio:runner:install
studio:agent:install will fail if the studio-v2 LLM Connection does not already exist in the UI. Complete Steps 1 and 2 before running any of these commands.
Verify
# All pods should be Running
kubectl get pods -n crewai
# Check web pod logs for errors
kubectl logs -l app.kubernetes.io/component=web --tail=50 -n crewai
# Check worker pod logs
kubectl logs -l app.kubernetes.io/component=worker --tail=50 -n crewai
ALB Ingress
# ADDRESS should show an ALB DNS name within 2–3 minutes
kubectl get ingress -n crewai
If ADDRESS is empty after 5 minutes, check the AWS Load Balancer Controller (LBC) logs:
kubectl logs -n kube-system deployment/aws-load-balancer-controller | tail -50
Common causes: incorrect alb.scheme casing, missing subnet tags, insufficient LBC IAM permissions.
Authentication
- Navigate to https://<YOUR_DOMAIN>
- Click Sign in with Microsoft
- Authenticate with a user assigned a role in Azure portal
- Verify the user lands on the dashboard without error
If login fails with a redirect URI mismatch or authentication error, verify the redirect URI in Azure matches https://<YOUR_DOMAIN>/auth/entra_id/callback exactly.
Wharf Trace Collection
# Wharf pod should be Running
kubectl get pods -l app.kubernetes.io/component=wharf -n crewai
# Check Wharf logs
kubectl logs -l app.kubernetes.io/component=wharf --tail=30 -n crewai
Wharf connects to the wharf database on the same RDS host as the main application. If the pod is in CrashLoopBackOff, verify the wharf database exists and the crewai user has access.
Studio V2
# Both deployments should report all replicas READY
kubectl get deploy -n crewai | grep -E "studio-assistant|studio-runner"
# Check Studio assistant logs
kubectl logs -l app=studio-assistant --tail=20 -n crewai
Factory Health Endpoint
TOKEN=$(kubectl get secret crewai-platform-secrets -n crewai \
-o jsonpath='{.data.FACTORY_DEBUG_TOKEN}' | base64 -d)
curl -H "X-Factory-Debug-Token: $TOKEN" \
https://<YOUR_DOMAIN>/health/debug
All components should report "status": "ok".
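Scanning the response by eye is error-prone when there are many components, so the check can be scripted. The sketch below assumes only that the response is JSON containing one or more "status" fields; the exact shape of the /health/debug payload is not documented here, and the function name is illustrative.

```shell
# Reads a JSON document on stdin and succeeds only if it contains at least
# one "status" field and every one of them is "ok". Prints non-ok entries.
all_statuses_ok() {
  statuses=$(grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' || true)
  if [ -z "$statuses" ]; then
    echo "no status fields found" >&2
    return 1
  fi
  bad=$(printf '%s\n' "$statuses" | grep -v '"ok"$' || true)
  if [ -n "$bad" ]; then
    printf '%s\n' "$bad"
    return 1
  fi
}

# Example usage (not executed here):
#   curl -s -H "X-Factory-Debug-Token: $TOKEN" https://<YOUR_DOMAIN>/health/debug \
#     | all_statuses_ok
```

This is a text match rather than a JSON parse, so treat a passing result as a quick signal and read the full response if anything looks off.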