Overview

This guide covers a full production deployment of CrewAI Platform on AWS EKS using:
  • Amazon RDS (PostgreSQL 16) for all databases including Wharf
  • Amazon S3 for object storage
  • Amazon ECR for crew container images
  • AWS ALB for ingress with ACM TLS termination
  • Microsoft Entra ID for SSO authentication
  • Wharf for OTLP trace and span collection
  • Studio V2 for the AI-powered crew builder (post-install)
This is a self-contained guide. All required configuration is included here — no cross-referencing other guides for values.

Prerequisites Checklist

Complete every item before running helm install.

AWS Infrastructure

| Component | Requirement |
| --- | --- |
| EKS cluster | Kubernetes 1.32.0+, AMD64 worker nodes |
| AWS Load Balancer Controller | Installed and running in kube-system |
| Amazon RDS | PostgreSQL 16, accessible from EKS worker nodes |
| Amazon S3 bucket | Created with versioning and encryption enabled |
| Amazon ECR repository | Name ends in /crewai-enterprise, tag mutability MUTABLE |
| ACM certificate | Issued and validated for your domain |
| IAM role | Pod Identity association for crewai-sa with S3 + ECR permissions |
| Public/private subnets | Tagged for ALB discovery (kubernetes.io/role/elb=1 on public subnets, kubernetes.io/role/internal-elb=1 on private subnets) |
CrewAI Platform only supports AMD64 (x86_64) worker nodes. ARM64 (Graviton) worker nodes are not supported.
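Because ARM64 nodes are unsupported, it can be worth confirming the architecture of every node before installing. The sketch below assumes the architecture is read from the standard .status.nodeInfo.architecture field of each node; the helper name non_amd64_nodes is hypothetical and is not part of kubectl or CrewAI.

```shell
# Hypothetical helper: reads "name<TAB>architecture" lines on stdin and
# prints the name of every node whose architecture is not amd64.
non_amd64_nodes() (
  awk -F '\t' 'NF && $2 != "amd64" { print $1 }'
)

# Usage against a live cluster (not run here):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.architecture}{"\n"}{end}' \
#     | non_amd64_nodes
```

Empty output means every node reported amd64. Any name that is printed belongs to a node group that should be excluded or replaced before installing.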

Microsoft Entra ID

  • App registration created in Azure portal
  • Redirect URI configured: https://<YOUR_DOMAIN>/auth/entra_id/callback
  • Application (client) ID collected
  • Directory (tenant) ID collected
  • Client secret created and value copied
  • Admin consent granted for Microsoft Graph User.Read
  • App Roles created: member and factory-admin
  • Admin user assigned factory-admin App Role
The redirect URI must be configured in Azure before running helm install. Authentication will fail silently if it is missing or uses the wrong path.

Tools

  • kubectl connected to your EKS cluster
  • helm 3.10+
  • AWS CLI with credentials for your account

Infrastructure Setup

RDS: Pre-Create All Four Databases

Connect to your RDS instance as the master user (postgres by default; on RDS this user holds rds_superuser rather than true superuser rights) and run:
-- Create the application user (if not already created)
CREATE USER crewai WITH PASSWORD '<DB_PASSWORD>';

-- Create all required databases
CREATE DATABASE crewai_plus_production;
CREATE DATABASE crewai_plus_cable_production;
CREATE DATABASE crewai_plus_oauth_production;   -- Chart default is "oauth_db" — this override is required
CREATE DATABASE wharf;                          -- Used by Wharf for trace/span storage

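-- Grant full access to the crewai user.
-- Note: on PostgreSQL 15+, database-level privileges do not include CREATE on the
-- public schema. If migrations fail with "permission denied for schema public",
-- connect to each database and run: GRANT ALL ON SCHEMA public TO crewai;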
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_cable_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE crewai_plus_oauth_production TO crewai;
GRANT ALL PRIVILEGES ON DATABASE wharf TO crewai;
The chart default for POSTGRES_OAUTH_DB is oauth_db. You must override it to crewai_plus_oauth_production to match the database you created above. A mismatch causes the OAuth service to fail on startup.
When postgres.enabled: false, the Helm chart does not create databases automatically. All four databases must exist before helm install runs.
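Since the chart will not create anything on RDS, a quick pre-flight check for the four databases can save a failed install. This is a minimal sketch: missing_databases is a hypothetical helper, and the database names are the ones created in the SQL above.

```shell
# Hypothetical helper: reads existing database names (one per line) on stdin
# and prints each required database that is absent from that list.
missing_databases() (
  existing=$(cat)
  for db in crewai_plus_production crewai_plus_cable_production \
            crewai_plus_oauth_production wharf; do
    printf '%s\n' "$existing" | grep -qxF "$db" || printf '%s\n' "$db"
  done
)

# Usage against RDS (not run here; psql will prompt for the password):
#   psql -h <YOUR_RDS_ENDPOINT> -U postgres -d postgres \
#     -Atc 'SELECT datname FROM pg_database' | missing_databases
```

No output means all four databases exist. The check only covers existence, not the privileges of the crewai user.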

S3: Create Bucket

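aws s3api create-bucket \
  --bucket <YOUR_S3_BUCKET> \
  --region <AWS_REGION> \
  --create-bucket-configuration LocationConstraint=<AWS_REGION>
# In us-east-1, omit --create-bucket-configuration: that region rejects a LocationConstraint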

aws s3api put-bucket-versioning \
  --bucket <YOUR_S3_BUCKET> \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-encryption \
  --bucket <YOUR_S3_BUCKET> \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}
    }]
  }'
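S3 treats us-east-1 as a special case in create-bucket: every other region needs --create-bucket-configuration LocationConstraint=<region>, while us-east-1 rejects that argument. A minimal sketch of handling both cases in a script; location_args is a hypothetical helper name.

```shell
# Hypothetical helper: prints the extra create-bucket arguments for a region.
# us-east-1 gets nothing; every other region gets a LocationConstraint.
location_args() (
  region=$1
  if [ "$region" != "us-east-1" ]; then
    printf '%s' "--create-bucket-configuration LocationConstraint=$region"
  fi
)

# Usage (not run here). The command substitution is left unquoted on purpose
# so that the two words become two separate arguments:
#   aws s3api create-bucket --bucket <YOUR_S3_BUCKET> --region "$REGION" \
#     $(location_args "$REGION")
```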

ECR: Create Repository

# Repository name MUST end in /crewai-enterprise
aws ecr create-repository \
  --repository-name <YOUR_ORG>/crewai-enterprise \
  --region <AWS_REGION> \
  --image-scanning-configuration scanOnPush=true

# Tags MUST be mutable — CrewAI overwrites tags on each build
aws ecr put-image-tag-mutability \
  --repository-name <YOUR_ORG>/crewai-enterprise \
  --image-tag-mutability MUTABLE \
  --region <AWS_REGION>
Do not set ECR image tag mutability to IMMUTABLE. CrewAI rewrites tags for crew versions — immutable tags cause build failures.
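The naming rule for the repository is easy to get wrong when the name is assembled from variables. The sketch below checks it before the repository is created; valid_repo_name is a hypothetical helper, and it assumes the rule is exactly "a non-empty prefix followed by /crewai-enterprise" as stated above.

```shell
# Hypothetical helper: succeeds only if the repository name has a non-empty
# prefix and ends in /crewai-enterprise.
valid_repo_name() (
  case $1 in
    ?*/crewai-enterprise) exit 0 ;;
    *) exit 1 ;;
  esac
)

# Usage (not run here):
#   valid_repo_name "$REPO_NAME" || { echo "bad repository name: $REPO_NAME" >&2; exit 1; }
```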

IAM: Pod Identity for S3 and ECR

Create a combined IAM policy for S3 and ECR access, then attach it to a role configured for Pod Identity.
Combined IAM policy (crewai-platform-policy.json):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3Access",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<YOUR_S3_BUCKET>",
        "arn:aws:s3:::<YOUR_S3_BUCKET>/*"
      ]
    },
    {
      "Sid": "ECRAuthToken",
      "Effect": "Allow",
      "Action": ["ecr:GetAuthorizationToken"],
      "Resource": "*"
    },
    {
      "Sid": "ECRPushPull",
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:PutImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload"
      ],
      "Resource": "arn:aws:ecr:<AWS_REGION>:<ACCOUNT_ID>:repository/*"
    }
  ]
}
# Create policy
aws iam create-policy \
  --policy-name CrewAIPlatformAccess \
  --policy-document file://crewai-platform-policy.json

# Create IAM role for Pod Identity
aws iam create-role \
  --role-name CrewAIPodIdentityRole \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"Service": "pods.eks.amazonaws.com"},
      "Action": ["sts:AssumeRole", "sts:TagSession"]
    }]
  }'

# Attach policy to role
aws iam attach-role-policy \
  --role-name CrewAIPodIdentityRole \
  --policy-arn arn:aws:iam::<ACCOUNT_ID>:policy/CrewAIPlatformAccess

# Create Pod Identity association — service account name is crewai-sa (created by rbac.create: true)
aws eks create-pod-identity-association \
  --cluster-name <YOUR_CLUSTER> \
  --namespace crewai \
  --service-account crewai-sa \
  --role-arn arn:aws:iam::<ACCOUNT_ID>:role/CrewAIPodIdentityRole
With rbac.create: true (the chart default), the Helm chart creates a ServiceAccount named crewai-sa. The Pod Identity association must reference this exact name and the namespace you deploy into.

ACM: Request Certificate

aws acm request-certificate \
  --domain-name <YOUR_DOMAIN> \
  --validation-method DNS \
  --region <AWS_REGION>
# Save the certificate ARN — needed in values.yaml

Entra ID: Azure Portal Setup

Step 1: App Registration

  1. Go to portal.azure.com > Microsoft Entra ID > App registrations > New registration
  2. Name: CrewAI (or your preferred name)
  3. Supported account types: Accounts in this organizational directory only
  4. Redirect URI: Web platform — https://<YOUR_DOMAIN>/auth/entra_id/callback
  5. Click Register
The redirect URI path must be exactly /auth/entra_id/callback (lowercase, underscore). Configure this before running helm install — a missing or incorrect URI causes authentication to fail with no clear error.
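Because the path has to match character for character, it may help to generate the URI from the host name rather than typing it. A minimal sketch; redirect_uri and redirect_uri_matches are hypothetical helpers, and the only fact they encode is the /auth/entra_id/callback path given above.

```shell
# Hypothetical helper: builds the redirect URI for a bare host name.
redirect_uri() (
  printf 'https://%s/auth/entra_id/callback\n' "$1"
)

# Hypothetical helper: succeeds only if the URI configured in Azure ($2) is
# exactly the one expected for the host ($1).
redirect_uri_matches() (
  [ "$2" = "$(redirect_uri "$1")" ]
)

# Usage (not run here):
#   redirect_uri crewai.company.com
#   redirect_uri_matches crewai.company.com "$URI_FROM_AZURE" || echo "mismatch" >&2
```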

Step 2: Collect Credentials

From the app overview page, copy:
  • Application (client) ID → ENTRA_ID_CLIENT_ID
  • Directory (tenant) ID → ENTRA_ID_TENANT_ID

Step 3: Create Client Secret

  1. Left sidebar > Manage > Certificates & secrets
  2. New client secret — enter a description, choose expiration
  3. Copy the Value immediately; it is not shown again → ENTRA_ID_CLIENT_SECRET

Step 4: Grant Admin Consent

  1. Enterprise applications > select your app
  2. Security > Permissions > Grant admin consent
  3. Confirm consent for Microsoft Graph User.Read

Step 5: Create App Roles

  1. Back in App registrations > your app > Manage > App roles
  2. Create two roles:
| Display Name | Value | Allowed Member Types |
| --- | --- | --- |
| Member | member | Users/Groups |
| Factory Admin | factory-admin | Users/Groups |
Ensure “Do you want to enable this app role?” is checked for each.

Step 6: Assign Users

  1. Enterprise applications > your app > Manage > Properties
  2. Set Assignment required? to Yes, then Save
  3. Manage > Users and groups > Add user/group
    • Regular users: assign Member role
    • Admin users: assign Factory Admin role
For Entra ID deployments, admin access is granted exclusively through the Factory Admin App Role in Azure portal. Do NOT run factory:grant_admin — it writes to a database table that is not consulted for Entra ID users. The platform reads admin status from the JWT roles claim.

Complete values.yaml

Replace all <PLACEHOLDER> values before running helm install.
values.yaml
# ──────────────────────────────────────────────
# Disable bundled PostgreSQL — using Amazon RDS
# ──────────────────────────────────────────────
postgres:
  enabled: false

# ──────────────────────────────────────────────
# Disable bundled MinIO — using Amazon S3
# ──────────────────────────────────────────────
minio:
  enabled: false

# ──────────────────────────────────────────────
# Database: Amazon RDS
# ──────────────────────────────────────────────
envVars:
  DB_HOST: "<YOUR_RDS_ENDPOINT>"             # e.g. crewai.cluster-abc123.us-east-1.rds.amazonaws.com
  DB_PORT: "5432"
  DB_USER: "crewai"
  POSTGRES_DB: "crewai_plus_production"
  POSTGRES_CABLE_DB: "crewai_plus_cable_production"
  POSTGRES_OAUTH_DB: "crewai_plus_oauth_production"  # Chart default is "oauth_db" — must match what you created in RDS

  # ──────────────────────────────────────────
  # Object storage: Amazon S3
  # Using Pod Identity — no static credentials needed
  # ──────────────────────────────────────────
  STORAGE_SERVICE: "amazon"
  AWS_REGION: "<AWS_REGION>"                 # e.g. us-east-1
  AWS_BUCKET: "<YOUR_S3_BUCKET>"

  # ──────────────────────────────────────────
  # Crew image registry: Amazon ECR
  # Value is the registry prefix WITHOUT /crewai-enterprise
  # The platform appends /crewai-enterprise automatically
  # ──────────────────────────────────────────
  CREW_IMAGE_REGISTRY_OVERRIDE: "<ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<YOUR_ORG>"
  # Example: 123456789012.dkr.ecr.us-east-1.amazonaws.com/production
  # Do NOT include /crewai-enterprise here — the platform appends it

  # ──────────────────────────────────────────
  # Authentication: Microsoft Entra ID
  # CLIENT_ID and TENANT_ID are non-sensitive identifiers — they belong under envVars
  # CLIENT_SECRET is a credential — it belongs under secrets (below)
  # ──────────────────────────────────────────
  AUTH_PROVIDER: "entra_id"                  # exact value — lowercase with underscore
  ENTRA_ID_CLIENT_ID: "<APPLICATION_CLIENT_ID>"    # from App Registration overview
  ENTRA_ID_TENANT_ID: "<DIRECTORY_TENANT_ID>"      # from App Registration overview

  # ──────────────────────────────────────────
  # Application
  # ──────────────────────────────────────────
  APPLICATION_HOST: "<YOUR_DOMAIN>"          # e.g. crewai.company.com — no https:// prefix
  RAILS_LOG_LEVEL: "info"
  RAILS_MAX_THREADS: "5"
  DB_POOL: "5"

# ──────────────────────────────────────────────
# Secrets
# ──────────────────────────────────────────────
secrets:
  DB_PASSWORD: "<YOUR_DB_PASSWORD>"
  SECRET_KEY_BASE: "<128-char-hex-string>"   # openssl rand -hex 64 (64 random bytes, 128 hex characters)
  ENTRA_ID_CLIENT_SECRET: "<CLIENT_SECRET_VALUE>"  # from Azure Certificates & Secrets

# ──────────────────────────────────────────────
# Wharf: OTLP trace and span collection
# wharf.enabled is the chart default (true) — included here for clarity
# Wharf shares the same DB host/port/user/password as the main app
# The wharf database must be pre-created in RDS (see SQL above)
# ──────────────────────────────────────────────
wharf:
  enabled: true
  postgres:
    wharfDatabase: "wharf"                   # must match the database you created in RDS

# ──────────────────────────────────────────────
# Web: application pods
# ──────────────────────────────────────────────
web:
  replicaCount: 2
  enableSslFromPuma: false                   # REQUIRED when using ALB — chart default is true
                                             # Omitting this causes 502 Bad Gateway on every request
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"

  ingress:
    enabled: true
    className: "alb"
    host: "<YOUR_DOMAIN>"                    # must match APPLICATION_HOST and ACM certificate

    annotations:
      alb.ingress.kubernetes.io/ssl-redirect: "443"
      alb.ingress.kubernetes.io/load-balancer-attributes: "idle_timeout.timeout_seconds=300"
      alb.ingress.kubernetes.io/healthcheck-path: "/up"

    alb:
      scheme: "internet-facing"              # lowercase only — capital letters silently fail to provision
      targetType: "ip"
      certificateArn: "arn:aws:acm:<AWS_REGION>:<ACCOUNT_ID>:certificate/<CERT_ID>"
      sslPolicy: "ELBSecurityPolicy-TLS13-1-2-2021-06"

# ──────────────────────────────────────────────
# Worker: background job processing
# ──────────────────────────────────────────────
worker:
  replicaCount: 2
  resources:
    requests:
      cpu: "1000m"
      memory: "6Gi"
    limits:
      cpu: "6"
      memory: "12Gi"

# ──────────────────────────────────────────────
# BuildKit: crew container image builds
# ──────────────────────────────────────────────
buildkit:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: "500m"
      memory: "2Gi"
    limits:
      cpu: "4"
      memory: "8Gi"

# ──────────────────────────────────────────────
# RBAC: auto-creates ServiceAccount crewai-sa
# Required for Pod Identity — name must match the association created in AWS
# ──────────────────────────────────────────────
rbac:
  create: true
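A forgotten placeholder in values.yaml usually surfaces only after pods start failing, so a scan before install can be useful. This is a sketch under the assumption that every placeholder follows the <NAME> convention used in the file above; unreplaced_placeholders is a hypothetical helper.

```shell
# Hypothetical helper: prints "line-number:text" for every line of the given
# file that still contains an angle-bracket placeholder such as <YOUR_DOMAIN>.
# Exit status follows grep: 0 if placeholders remain, 1 if none were found.
unreplaced_placeholders() (
  grep -nE '<[A-Za-z0-9_-]+>' "$1"
)

# Usage (not run here):
#   if unreplaced_placeholders values.yaml; then
#     echo "values.yaml still contains placeholders" >&2
#   fi
```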
web.enableSslFromPuma: false is required. The chart default is true. ALB terminates TLS and forwards plain HTTP to backend pods. With the default true, Puma expects HTTPS connections but receives HTTP from the ALB, causing 502 Bad Gateway errors and failed health checks on every request.
alb.scheme is case-sensitive. The AWS Load Balancer Controller only accepts lowercase values. Using "Internet-Facing" or "Internal" (capitalized) causes the ALB to silently fail to provision — the ingress will have no ADDRESS indefinitely.
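If the scheme comes from a variable or a CI parameter, it can be lowercased and validated before it reaches values.yaml. A minimal sketch; normalize_alb_scheme is a hypothetical helper, and the two accepted values are the schemes the AWS Load Balancer Controller documents (internet-facing and internal).

```shell
# Hypothetical helper: lowercases the given scheme and prints it if it is one
# of the two accepted values; otherwise reports the input and fails.
normalize_alb_scheme() (
  scheme=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $scheme in
    internet-facing|internal) printf '%s\n' "$scheme" ;;
    *) printf 'unknown ALB scheme: %s\n' "$1" >&2; exit 1 ;;
  esac
)

# Usage (not run here):
#   SCHEME=$(normalize_alb_scheme "$SCHEME_FROM_CI") || exit 1
```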
CREW_IMAGE_REGISTRY_OVERRIDE must not include /crewai-enterprise. The platform appends this suffix automatically. If your ECR repository is 123456789012.dkr.ecr.us-east-1.amazonaws.com/production/crewai-enterprise, set only 123456789012.dkr.ecr.us-east-1.amazonaws.com/production. Including the suffix causes push failures to a double-suffixed path.
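One way to avoid the double suffix is to derive the override from the full repository URI instead of typing it separately. A minimal sketch; registry_override is a hypothetical helper that only strips a trailing /crewai-enterprise and leaves any other value unchanged.

```shell
# Hypothetical helper: strips a trailing /crewai-enterprise from a full ECR
# repository URI, yielding the value for CREW_IMAGE_REGISTRY_OVERRIDE.
registry_override() (
  printf '%s\n' "${1%/crewai-enterprise}"
)

# Usage (not run here):
#   registry_override 123456789012.dkr.ecr.us-east-1.amazonaws.com/production/crewai-enterprise
```

Applying the helper to a value that already lacks the suffix returns it unchanged, so it is safe to run on either form.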

Install

helm install crewai-platform \
  oci://registry.crewai.com/crewai/stable/crewai-platform \
  --values values.yaml \
  --namespace crewai \
  --create-namespace
Wait for all pods to reach Running:
kubectl get pods -n crewai -w
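Watching the list by eye works, but a filter that shows only the pods still outside the expected state is easier to read on a busy cluster. The sketch below is an assumption-laden helper: pods_not_ready is a hypothetical name, it relies on STATUS being the third column of kubectl get pods --no-headers, and it treats Completed as acceptable on the assumption that one-off job pods may be present.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints "name status" for pods that are neither Running nor Completed.
pods_not_ready() (
  awk 'NF && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
)

# Usage against a live cluster (not run here):
#   kubectl get pods -n crewai --no-headers | pods_not_ready
```

Empty output means every pod has reached Running (or Completed). Running does not guarantee that readiness probes pass, so the READY column is still worth a glance.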

Post-Install

Required Initialization

These commands must be completed before any user can log in. Run them in the order shown.
# 1. Initialize the internal CrewAI organization
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails studio:install_internal_organization

# 2. Set up default permission roles
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails factory:setup_permissions_defaults

# 3. Add your admin user as owner of org 2 (the first customer-facing org)
#    Replace admin@company.com with the email address assigned the factory-admin App Role in Azure
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails 'factory:add_owner[2,admin@company.com]'
Do NOT run factory:grant_admin for Entra ID deployments. Admin panel access is controlled by the factory-admin App Role in Azure portal. The factory:grant_admin command writes to a database table that Entra ID authentication does not consult — it has no effect on Entra ID users and will not grant admin access.
For Entra ID, the user record is created in the database automatically on first login. The factory:add_owner command above can be run before or after the user’s first login.
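The quotes around factory:add_owner[...] matter: square brackets are glob characters, and zsh in particular rejects an unquoted pattern that matches no file. When the organization ID and email come from variables, the task string can be built and checked first. A minimal sketch; add_owner_task is a hypothetical helper, and the rule that arguments contain no spaces or commas is an assumption based on how bracketed task arguments are split.

```shell
# Hypothetical helper: prints the task string for factory:add_owner after
# checking that the org id is numeric and the email has no spaces or commas.
add_owner_task() (
  org_id=$1; email=$2
  case $org_id in
    ''|*[!0-9]*) echo "org id must be numeric" >&2; exit 1 ;;
  esac
  case $email in
    ''|*" "*|*,*) echo "email must not be empty or contain spaces or commas" >&2; exit 1 ;;
  esac
  printf 'factory:add_owner[%s,%s]\n' "$org_id" "$email"
)

# Usage (not run here); the double quotes keep the brackets away from the shell:
#   kubectl exec -it deploy/crewai-web -n crewai -- \
#     bin/rails "$(add_owner_task 2 admin@company.com)"
```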

Studio V2 Setup

Studio V2 cannot be configured in values.yaml. Adding studioV2.enabled or STUDIO_V2_ENABLED has no effect: Helm silently ignores unknown keys. Setup requires the platform to be fully running and accessible.
Step 1: Create the LLM Connection (UI)
  1. Log in to the CrewAI web UI as an admin
  2. Navigate to Settings → LLM Connections
  3. Click New Connection
  4. Set the name to exactly studio-v2 (lowercase, no spaces)
  5. Select your LLM provider, enter the model name and API key
  6. Click Save
The connection name must be exactly studio-v2. The install commands in Step 3 look up this name specifically — a different name or capitalization causes them to fail silently.
Step 2: Set as Default Connection (UI)
  1. Navigate to Settings → Crew Studio
  2. Under Default Connection, select studio-v2
  3. Click Save
Step 3: Run Install Commands (kubectl)
Run these commands in order. Each must complete successfully before running the next.
# 1. Install the Studio agent
#    This command FAILS if the studio-v2 LLM Connection does not exist yet
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails studio:agent:install

# 2. Sync and index tools
kubectl exec -it deploy/crewai-web -n crewai -- \
  bin/rails studio:tools:sync_crewai_tools \
       studio:tools:sync_enterprise_tools

# 3. Install the Studio runner
kubectl exec -it deploy/crewai-web -n crewai -- bin/rails studio:runner:install
studio:agent:install will fail if the studio-v2 LLM Connection does not already exist in the UI. Complete Steps 1 and 2 before running any of these commands.
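When the install commands are scripted rather than typed one at a time, the "stop at the first failure" rule has to be enforced explicitly. The sketch below shows one way to do that; run_in_order is a hypothetical helper that takes each step as a single quoted command line.

```shell
# Hypothetical helper: runs each argument as a shell command line, in order,
# and stops at the first one that exits non-zero.
run_in_order() (
  for step in "$@"; do
    printf '==> %s\n' "$step"
    sh -c "$step" || { printf 'step failed: %s\n' "$step" >&2; exit 1; }
  done
)

# Usage (not run here). -it is omitted because no interactive terminal is needed:
#   run_in_order \
#     'kubectl exec deploy/crewai-web -n crewai -- bin/rails studio:agent:install' \
#     'kubectl exec deploy/crewai-web -n crewai -- bin/rails studio:tools:sync_crewai_tools studio:tools:sync_enterprise_tools' \
#     'kubectl exec deploy/crewai-web -n crewai -- bin/rails studio:runner:install'
```

The helper relies on exit codes only. A command that reports a problem in its output but still exits 0 would not stop the sequence, so the output is still worth reading.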

Verify

Platform Health

# All pods should be Running
kubectl get pods -n crewai

# Check web pod logs for errors
kubectl logs -l app.kubernetes.io/component=web --tail=50 -n crewai

# Check worker pod logs
kubectl logs -l app.kubernetes.io/component=worker --tail=50 -n crewai

ALB Ingress

# ADDRESS should show an ALB DNS name within 2–3 minutes
kubectl get ingress -n crewai
If ADDRESS is empty after 5 minutes, check the LBC logs:
kubectl logs -n kube-system deployment/aws-load-balancer-controller | tail -50
Common causes: incorrect alb.scheme casing, missing subnet tags, insufficient LBC IAM permissions.
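Instead of re-running kubectl get ingress by hand, the wait for ADDRESS can be expressed as a bounded polling loop. This is a generic sketch: wait_for_output is a hypothetical helper, and the jsonpath in the usage comment assumes the first ingress in the namespace is the platform's web ingress.

```shell
# Hypothetical helper: re-runs a command until it prints something or the
# attempts run out. Arguments: attempts, delay in seconds, then the command.
wait_for_output() (
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    out=$("$@" 2>/dev/null) || out=
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      exit 0
    fi
    i=$((i + 1))
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  exit 1
)

# Usage (not run here): 30 attempts, 10 seconds apart, roughly 5 minutes.
#   wait_for_output 30 10 kubectl get ingress -n crewai \
#     -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}' \
#     || echo "no ALB address yet; check the LBC logs" >&2
```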

Authentication

  1. Navigate to https://<YOUR_DOMAIN>
  2. Click Sign in with Microsoft
  3. Authenticate with a user assigned a role in Azure portal
  4. Verify the user lands on the dashboard without error
If login fails with a redirect URI mismatch or authentication error, verify the redirect URI in Azure matches https://<YOUR_DOMAIN>/auth/entra_id/callback exactly.

Wharf Trace Collection

# Wharf pod should be Running
kubectl get pods -l app.kubernetes.io/component=wharf -n crewai

# Check Wharf logs
kubectl logs -l app.kubernetes.io/component=wharf --tail=30 -n crewai
Wharf connects to the wharf database on the same RDS host as the main application. If the pod is in CrashLoopBackOff, verify the wharf database exists and the crewai user has access.

Studio V2

# Both deployments should report all replicas READY
kubectl get deploy -n crewai | grep -E "studio-assistant|studio-runner"

# Check Studio assistant logs
kubectl logs -l app=studio-assistant --tail=20 -n crewai

Factory Health Endpoint

TOKEN=$(kubectl get secret crewai-platform-secrets -n crewai \
  -o jsonpath='{.data.FACTORY_DEBUG_TOKEN}' | base64 -d)

curl -H "X-Factory-Debug-Token: $TOKEN" \
  https://<YOUR_DOMAIN>/health/debug
All components should report "status": "ok".
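For a scripted check, the response can be scanned for any status other than ok. The exact shape of the /health/debug response is not documented here, so this sketch assumes only that each component appears with a "status": "..." pair somewhere in the JSON; health_ok is a hypothetical helper, and it uses grep -o, which GNU, BSD, and BusyBox grep all provide.

```shell
# Hypothetical helper: reads the response body on stdin. Succeeds only if at
# least one "status" field is present and every one of them is "ok".
health_ok() (
  statuses=$(grep -oE '"status"[[:space:]]*:[[:space:]]*"[^"]*"' || true)
  [ -n "$statuses" ] || exit 1   # no status fields, e.g. an HTML error page
  bad=$(printf '%s\n' "$statuses" | grep -vcE ':[[:space:]]*"ok"$' || true)
  [ "$bad" -eq 0 ]
)

# Usage (not run here), with TOKEN set as shown above:
#   curl -s -H "X-Factory-Debug-Token: $TOKEN" https://<YOUR_DOMAIN>/health/debug \
#     | health_ok || echo "at least one component is not ok" >&2
```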