
Configuration

All scale settings are managed from Settings → Scale in the Synapse UI. Changes are persisted to the settings database and take effect after clicking Save Scale Config. The exceptions are the Redis URL and the Postgres URL: after saving a change to either, restart the server so that it reconnects with the new value.

Every setting can also be provided as an environment variable passed to the worker image (see Environment variables reference). Environment variables take precedence over values stored in the UI settings when both are present.
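The precedence rule above can be sketched as a small lookup. This is only an illustration: the function name `effective_setting` and the idea that UI settings are keyed by the same names as the environment variables are assumptions, not part of Synapse.

```python
import os
from typing import Mapping, Optional


def effective_setting(
    name: str,
    ui_settings: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the environment value when present, otherwise the UI value.

    Hypothetical helper: it assumes UI settings use the same keys as the
    environment variables, which the source does not state.
    """
    env = os.environ if env is None else env
    if name in env:
        # An environment variable wins whenever both sources define the setting.
        return env[name]
    return ui_settings.get(name)
```

For example, `effective_setting("REDIS_URL", ui, {"REDIS_URL": "redis://env-host:6379/0"})` returns the environment value even if `ui` also holds a Redis URL.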


Redis connection

Redis URL

The connection string for your Redis instance.

| Format | Example |
| --- | --- |
| Single node | redis://hostname:6379/0 |
| With password | redis://:password@hostname:6379/0 |
| TLS (rediss) | rediss://hostname:6380/0 |
| Redis Cluster | redis+cluster://node1:6379,node2:6379,node3:6379 |

The database index at the end (/0) is optional and defaults to 0. Use a dedicated database index if you share a Redis instance across multiple applications to avoid key collisions.

Redis Cluster mode is auto-detected from the redis+cluster:// URL scheme — no separate toggle is needed. All queue shards are distributed across cluster nodes using Redis hash slots.
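To make the URL rules concrete, here is a minimal sketch of how a URL in one of the formats above could be classified by scheme. The helper `describe_redis_url` is hypothetical and uses only the Python standard library; it is not Synapse's own parser.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = {"redis", "rediss", "redis+cluster"}


def describe_redis_url(url: str) -> dict:
    """Classify a Redis URL by scheme and extract its nodes and database index."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported Redis URL scheme: {parts.scheme!r}")
    # Drop any "user:password@" prefix, then split the comma-separated node
    # list that cluster URLs carry in the host portion.
    host_part = parts.netloc.rsplit("@", 1)[-1]
    nodes = [node for node in host_part.split(",") if node]
    db_text = parts.path.lstrip("/")
    return {
        "cluster": parts.scheme == "redis+cluster",  # detected from the scheme alone
        "tls": parts.scheme == "rediss",
        "nodes": nodes,
        "db": int(db_text) if db_text else 0,  # the index is optional, default 0
    }
```

Calling it with `redis+cluster://node1:6379,node2:6379,node3:6379` reports cluster mode with three nodes, while `redis://hostname:6379` reports a single node on database 0.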

Note: For managed Redis services, use the connection string provided by your provider. Most managed services (Upstash, Redis Cloud, ElastiCache) use rediss:// (with double-s) for TLS connections.

Enable Scale Mode

When enabled, all V2 API calls (/api/v2/...) route jobs through the Redis ARQ queue rather than executing inline. The API returns 202 Accepted immediately with a run_id.

When disabled, the main instance falls back to inline synchronous execution and the V2 API behaves like V1 — blocking until the run completes. In-flight jobs on workers continue until they finish — disabling scale mode does not cancel running jobs.

Queue shards

| Default | Range |
| --- | --- |
| 1 | 1–64 |

The number of ARQ queue shards. With a single shard, all jobs go into one Redis sorted set (synapse:orchestrations:default). With multiple shards, jobs are distributed across synapse:orchestrations:0, synapse:orchestrations:1, etc.

Increase queue shards when:

  • Running on Redis Cluster — multiple shards allow jobs to be spread across cluster nodes, each owning different hash slots
  • Queue throughput exceeds roughly 10,000 enqueues/second on a single Redis node
  • You have enabled tenant isolation and want dedicated shards per tenant

Each worker can be pinned to a specific shard via the WORKER_QUEUE_SHARD environment variable. This allows you to dedicate certain worker capacity to specific tenants or job priorities.
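The queue naming scheme can be illustrated with a short sketch. The key names come from the description above, but the source does not say how a job is assigned to a shard, so the CRC32-modulo rule and the function name `queue_for_job` are assumptions made purely for illustration.

```python
import zlib

QUEUE_PREFIX = "synapse:orchestrations"


def queue_for_job(run_id: str, num_shards: int) -> str:
    """Return the queue key a job would land in.

    Assumption: jobs are spread by hashing the run ID. Synapse may use a
    different rule (for example, per-tenant assignment).
    """
    if num_shards <= 1:
        # Single-shard deployments use one queue named "default".
        return f"{QUEUE_PREFIX}:default"
    shard = zlib.crc32(run_id.encode("utf-8")) % num_shards
    return f"{QUEUE_PREFIX}:{shard}"
```

A worker started with WORKER_QUEUE_SHARD=2 would then consume only from synapse:orchestrations:2.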

Worker concurrency

| Default | Range |
| --- | --- |
| 10 | 1–100 |

The maximum number of jobs a single worker process executes concurrently. This setting is stored in Postgres and read by workers at startup.

Workers use asyncio for concurrency — concurrent jobs run as coroutines, not threads. This means:

  • For I/O-heavy workloads (LLM API calls, web searches, database queries), a concurrency of 10–20 per worker is a reasonable starting point. The worker spends most of its time waiting on external services, so concurrency translates directly to throughput.
  • For CPU-heavy workloads (local model inference, large data processing), keep concurrency lower (3–5) and run more worker instances in parallel.

This setting can be overridden per worker instance using the WORKER_CONCURRENCY environment variable, allowing different worker pools to have different concurrency levels.
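The effect of a concurrency limit on coroutine-based jobs can be seen in a small standalone sketch. It uses `asyncio.Semaphore` to cap in-flight jobs and `asyncio.sleep` as a stand-in for waiting on an external service; it is not the worker's actual scheduling code, and `run_jobs` is a hypothetical name.

```python
import asyncio


async def run_jobs(job_count: int, concurrency: int, io_delay: float = 0.01) -> int:
    """Run fake I/O-bound jobs and return the peak number in flight at once."""
    limiter = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def job() -> None:
        nonlocal in_flight, peak
        async with limiter:  # at most `concurrency` jobs pass this point
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(io_delay)  # simulated wait on an external service
            in_flight -= 1

    await asyncio.gather(*(job() for _ in range(job_count)))
    return peak


if __name__ == "__main__":
    print(asyncio.run(run_jobs(job_count=25, concurrency=10)))
```

Because the jobs only wait, all of them overlap on a single thread up to the limit. A CPU-bound job would block the event loop instead, which is why lower concurrency and more worker instances suit that case.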


Postgres sync

Workers do not read from local JSON files. They read all orchestration, agent, tool, and MCP server definitions from Postgres. The main instance is the single source of truth — you must sync local definitions to Postgres before workers can access them.

Postgres URL

Any standard PostgreSQL connection URL is accepted.

| Format | Example |
| --- | --- |
| Standard | postgresql://user:password@host:5432/dbname |
| SSL required | postgresql://user:password@host:5432/dbname?sslmode=require |
| Via PgBouncer | postgresql://user:password@pgbouncer-host:6432/dbname |

Synapse normalises the URL to the asyncpg dialect internally — you do not need to include +asyncpg in the URL.
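As a rough picture of what such a normalisation step does, the sketch below rewrites the scheme and leaves the rest of the URL untouched. The function `to_asyncpg_url` is hypothetical; Synapse's internal implementation may differ, for example in how it treats query parameters.

```python
def to_asyncpg_url(url: str) -> str:
    """Rewrite a postgresql:// URL to the postgresql+asyncpg:// dialect form."""
    scheme, separator, rest = url.partition("://")
    if not separator:
        raise ValueError("expected a URL of the form scheme://...")
    if scheme == "postgresql+asyncpg":
        return url  # already in the target form
    if scheme == "postgresql":
        # Only the scheme changes; credentials, host, and query are kept as-is.
        return "postgresql+asyncpg://" + rest
    raise ValueError(f"unsupported Postgres URL scheme: {scheme!r}")
```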

Note: This database is separate from any application database you may use for memory storage or RAG. Use a dedicated database (e.g. named synapse) for scale mode infrastructure.

PgBouncer mode

When enabled, Synapse uses NullPool (no persistent connection pool), opening and closing a connection for each query. This is required when the Postgres URL points to PgBouncer running in transaction pooling mode.

Enable PgBouncer mode when:

  • Your Postgres URL points to a PgBouncer proxy
  • PgBouncer is configured with pool_mode = transaction

Do not enable this when connecting directly to Postgres — it disables connection pooling and adds latency on every query.

Auto-sync on change

When enabled, Synapse automatically pushes the updated definition to Postgres whenever you save an orchestration, agent, tool, or MCP server from the UI.

When disabled, you must click Sync Now manually after every change before workers will pick up the updated definitions.

Note: Workers read definitions from Postgres at the start of each job, not at worker startup. A worker that is already running when you sync will pick up new definitions on its next job.

Sync Now

Clicking Sync Now triggers a full sync of all local definitions to Postgres. The sync operation:

  1. Reads all orchestration, agent, tool, and MCP server configurations from local JSON files on the main instance
  2. Reads LLM API keys and secret settings from the settings database
  3. Upserts all records into the corresponding Postgres tables using the definition ID as the unique key
  4. Reports a total_synced count and any errors encountered
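Step 3 is an upsert keyed on the definition ID. The sketch below shows that pattern using the standard-library `sqlite3` module as a stand-in for Postgres, since both accept `INSERT ... ON CONFLICT ... DO UPDATE`. The table layout, column names, and the function `sync_definitions` are assumptions for illustration only.

```python
import json
import sqlite3


def sync_definitions(conn: sqlite3.Connection, definitions: list) -> dict:
    """Upsert definitions by ID and report a count plus any errors."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS definitions ("
        "id TEXT PRIMARY KEY, kind TEXT NOT NULL, body TEXT NOT NULL)"
    )
    total_synced = 0
    errors = []
    for definition in definitions:
        try:
            conn.execute(
                "INSERT INTO definitions (id, kind, body) VALUES (?, ?, ?) "
                "ON CONFLICT(id) DO UPDATE SET "
                "kind = excluded.kind, body = excluded.body",
                (definition["id"], definition["kind"], json.dumps(definition["body"])),
            )
            total_synced += 1
        except (KeyError, TypeError, sqlite3.Error) as exc:
            # Record the failure and keep going with the remaining definitions.
            errors.append(f"{definition.get('id', '<missing id>')}: {exc!r}")
    conn.commit()
    return {"total_synced": total_synced, "errors": errors}
```

Because the ID is the unique key, syncing the same definition twice updates the existing row rather than creating a duplicate, which is what makes repeated runs of a full sync safe.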

Always run Sync Now after:

  • Adding or modifying an orchestration, agent, tool, or MCP server (if auto-sync is disabled)
  • Changing LLM API keys in Settings → LLM Providers
  • First enabling scale mode on a fresh deployment
  • Restoring definitions from a backup or import

S3 storage

When an S3 bucket is configured, vault files (documents, attachments) and execution logs are stored in S3 and shared across all worker instances. Without S3, each worker reads and writes these files to its own local disk. This is fine for single-worker deployments but breaks for multi-worker deployments — if Worker A writes a vault file during Step 1, Worker B cannot see it when it picks up Step 2 in a resumed run.

S3 bucket

The name of the S3 bucket to use. The bucket must already exist and the credentials provided must have the following permissions:

  • s3:GetObject
  • s3:PutObject
  • s3:ListBucket

Region

Default
us-east-1

The AWS region where the bucket is located. Must match the bucket's actual region.

Key prefix

Default
synapse

All objects are stored under this prefix. For example, with the default prefix, vault files are stored at synapse/vault/... and run logs at synapse/logs/.... Useful when sharing a bucket across multiple Synapse deployments.
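A minimal sketch of how a key under the prefix might be assembled follows. The `vault` and `logs` categories come from the example above; the layout beneath them is not documented here, so the `relative_path` argument and the helper name `object_key` are hypothetical.

```python
def object_key(prefix: str, category: str, relative_path: str) -> str:
    """Join prefix, category, and path into an S3 object key."""
    parts = [prefix.strip("/"), category.strip("/"), relative_path.lstrip("/")]
    # Skip empty segments so an empty prefix does not produce a leading slash.
    return "/".join(part for part in parts if part)
```

With two deployments sharing one bucket, giving each its own prefix (for example `synapse-staging` and `synapse-prod`, both illustrative names) keeps their objects in separate key spaces.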

Access key ID and secret access key

IAM credentials for S3 access. Leave both fields blank to use the instance's IAM role instead — this covers EC2 instance profiles, ECS task roles, EKS IRSA (IAM Roles for Service Accounts), and any other AWS credential chain mechanism.

Endpoint URL

Leave blank for AWS S3. Set this field to use an S3-compatible object store:

| Provider | Endpoint format |
| --- | --- |
| MinIO (self-hosted) | http://minio-host:9000 |
| Cloudflare R2 | https://{account_id}.r2.cloudflarestorage.com |
| Backblaze B2 | https://s3.{region}.backblazeb2.com |
| DigitalOcean Spaces | https://{region}.digitaloceanspaces.com |
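The S3 fields above map naturally onto the keyword arguments of an S3 client. The sketch below assumes a boto3-style client (the source does not say which SDK Synapse uses) and reads the variable names listed in the environment variables reference. `s3_client_kwargs` is a hypothetical helper.

```python
from typing import Mapping


def s3_client_kwargs(env: Mapping[str, str]) -> dict:
    """Build keyword arguments for an S3 client from the S3_* settings."""
    kwargs = {"region_name": env.get("S3_REGION") or "us-east-1"}
    if env.get("S3_ENDPOINT_URL"):
        # Only set for S3-compatible stores; AWS S3 needs no explicit endpoint.
        kwargs["endpoint_url"] = env["S3_ENDPOINT_URL"]
    key_id = env.get("S3_ACCESS_KEY_ID")
    secret = env.get("S3_SECRET_ACCESS_KEY")
    if key_id and secret:
        kwargs["aws_access_key_id"] = key_id
        kwargs["aws_secret_access_key"] = secret
    elif key_id or secret:
        raise ValueError("set both S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY, or neither")
    # With no explicit keys, the SDK falls back to its default credential chain
    # (instance profile, task role, IRSA, and so on).
    return kwargs
```

With boto3 installed, the result could be passed as `boto3.client("s3", **s3_client_kwargs(os.environ))`.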

Observability

OTLP endpoint

The gRPC endpoint of an OpenTelemetry-compatible collector. When configured, every orchestration run generates a distributed trace that propagates through:

  • API server (request receipt, job enqueue)
  • Redis (queue wait time)
  • Worker (step execution, tool invocations)
  • Postgres (state reads and writes)

Each step in the orchestration becomes a span within the trace, making it straightforward to identify slow steps and debug failures across multiple worker instances.

http://jaeger:4317 # local Jaeger via gRPC
http://otel-collector:4317 # OpenTelemetry Collector
https://api.honeycomb.io # Honeycomb Cloud

Leave blank to disable tracing. Running docker compose --profile enterprise up -d starts a local Jaeger instance with the UI at http://localhost:16686 and the OTLP endpoint at http://jaeger:4317.

Metrics token

A bearer token that Prometheus uses to authenticate when scraping GET /api/v2/metrics. Set this to a random secret string (e.g. openssl rand -hex 32) and configure your Prometheus scrape job to send it:

scrape_configs:
  - job_name: synapse-api
    metrics_path: /api/v2/metrics   # Prometheus scrapes /metrics unless told otherwise
    bearer_token: your-metrics-token
    static_configs:
      - targets: ['synapse-api:8765']

Leave blank for unauthenticated access to the metrics endpoint (not recommended in production).
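To check the token by hand before pointing Prometheus at the endpoint, a request like the following can be built with the standard library. The host and port are taken from the scrape example above; `build_metrics_request` is a hypothetical helper.

```python
from typing import Optional
from urllib.request import Request


def build_metrics_request(base_url: str, token: Optional[str]) -> Request:
    """Build a GET request for the metrics endpoint, with a bearer token if set."""
    request = Request(base_url.rstrip("/") + "/api/v2/metrics", method="GET")
    if token:
        # Same header Prometheus sends when a bearer token is configured.
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

Passing the result to `urllib.request.urlopen` performs the scrape. If the endpoint rejects requests without the token, `urlopen` raises `HTTPError`; the exact status code is not specified in this document.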

Max global queue depth

Default
1,000,000

The maximum total number of queued plus active jobs across all tenants. When this limit is exceeded, new job submissions return 503 Service Unavailable. This acts as a backpressure valve to prevent Redis from being overloaded during traffic spikes.

Rate limit per tenant

Default
1,000 req/s

Maximum API requests per second, per tenant, for V2 endpoints. Requests above this limit return 429 Too Many Requests. This prevents a single tenant from consuming all available capacity and starving other tenants.
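On the client side, both 429 and 503 signal "try again later" rather than a permanent failure. One common way to handle them is capped exponential backoff, sketched below. The base delay, cap, and function name `retry_delay` are illustrative choices, not values prescribed by Synapse.

```python
from typing import Optional


def retry_delay(
    status_code: int, attempt: int, base: float = 0.5, cap: float = 30.0
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry applies.

    429 means the tenant hit its rate limit; 503 means the global queue
    depth was exceeded. Other statuses are not retried by this sketch.
    """
    if status_code in (429, 503):
        # Double the wait on each attempt, but never exceed the cap.
        return min(cap, base * (2 ** attempt))
    return None
```

A submission loop would call this after each response, sleep for the returned delay, and stop once it receives 202 Accepted with a run_id.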


Tenants

Tenants are named quota boundaries. Each tenant gets its own rate limits and maximum queue depths. With tenant isolation enabled, each tenant also gets its own Redis queue shard, preventing a backlogged tenant from delaying jobs from other tenants.

What a tenant is

A tenant is any string you pass as the tenant_id field in a V2 API request body. You can use it to represent:

| Use case | Example tenant_id |
| --- | --- |
| SaaS customer | customer-acme-corp |
| Internal team | team-marketing |
| Environment | staging, production |
| Individual user | user-abc123 |
| Any arbitrary grouping | Anything you want to isolate or rate-limit |

If no tenant_id is provided, the request uses the value of DEFAULT_TENANT_ID (default: "default").
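The fallback rule can be sketched in a few lines. `resolve_tenant_id` is a hypothetical helper, and treating an empty `tenant_id` the same as a missing one is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_tenant_id(body: Mapping[str, object], env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the tenant for a request body, falling back to DEFAULT_TENANT_ID."""
    env = os.environ if env is None else env
    tenant_id = body.get("tenant_id")
    if isinstance(tenant_id, str) and tenant_id:
        return tenant_id
    # No usable tenant_id in the body: use the configured default ("default").
    return env.get("DEFAULT_TENANT_ID", "default")
```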

Tenant isolation

  • Enabled: each tenant gets its own Redis queue shard, so a backlogged tenant cannot delay jobs from other tenants
  • Disabled (default): all tenants share the same queue shards, and jobs are processed in FIFO order

Set ENABLE_TENANT_ISOLATION=1 when your deployment serves multiple customers and you need to guarantee that one customer's heavy workload cannot impact another's job latency.

Adding a tenant

Create a named tenant from the Tenants section of the Scale settings page:

| Field | Description |
| --- | --- |
| tenant_id | Unique identifier, used in API requests as "tenant_id": "..." |
| name | Human-readable label (optional) |
| max_queued_runs | Maximum queued jobs for this tenant. Submissions above this return 429 |
| max_concurrent_runs | Maximum simultaneously-running jobs for this tenant |

If a tenant is not explicitly created, it uses the global defaults (MAX_GLOBAL_QUEUE_DEPTH and RATE_LIMIT_PER_TENANT_RPS).
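How the per-tenant and global limits might interact at submission time is sketched below. The status codes follow the descriptions in this document, but the order of the checks, the `TenantQuota` type, and the `admit` function are assumptions and not Synapse's actual admission logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TenantQuota:
    """Limits for an explicitly created tenant (field names from the table above)."""
    max_queued_runs: int
    max_concurrent_runs: int


def admit(
    tenant_queued: int,
    global_depth: int,
    quota: Optional[TenantQuota],
    max_global_queue_depth: int = 1_000_000,
) -> int:
    """Return the HTTP status a new submission would receive."""
    if global_depth >= max_global_queue_depth:
        return 503  # global backpressure applies to every tenant
    if quota is not None and tenant_queued >= quota.max_queued_runs:
        return 429  # this tenant's own queue limit is reached
    return 202  # accepted and enqueued


if __name__ == "__main__":
    print(admit(tenant_queued=5, global_depth=100, quota=TenantQuota(5, 2)))
```

A tenant that was never created explicitly is passed as `quota=None`, so only the global limit applies to it in this sketch.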


Environment variables reference

All environment variables accepted by the synapseorchai/synapse-ai-worker image. Pass them via:

  • docker run -e VAR=value
  • docker run --env-file .env
  • Kubernetes Secret or ConfigMap referenced in the pod spec
  • Docker Compose environment: block

Required

| Variable | Description |
| --- | --- |
| REDIS_URL | Redis connection URL. Examples: redis://host:6379/0 (single node), redis+cluster://h1:6379,h2:6379 (cluster) |
| SCALE_POSTGRES_URL | Postgres connection URL: postgresql://user:pass@host:5432/dbname |

Worker tuning

| Variable | Default | Description |
| --- | --- | --- |
| WORKER_CONCURRENCY | 10 | Maximum concurrent jobs per worker process |
| WORKER_JOB_TIMEOUT | 3600 | Seconds before a job is forcibly cancelled and marked failed |
| WORKER_MAX_RETRIES | 3 | Number of retry attempts before moving a job to the dead-letter queue |
| WORKER_HEALTH_PORT | 9000 | Port for the HTTP health endpoint (GET /health) |
| WORKER_QUEUE_SHARD | default | Queue shard this worker consumes from. Use default for single-shard, or a numeric index (0, 1, …) for multi-shard deployments |
| NUM_QUEUE_SHARDS | 1 | Total number of queue shards. Must match the value set in the main instance UI |
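For scripts that launch or validate workers, the required and tuning variables above can be read into a typed object. The sketch below uses the variable names and defaults from these tables; the `WorkerSettings` class and `load_worker_settings` function are hypothetical and not part of the worker image.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WorkerSettings:
    redis_url: str
    scale_postgres_url: str
    worker_concurrency: int = 10
    worker_job_timeout: int = 3600
    worker_max_retries: int = 3
    worker_health_port: int = 9000
    worker_queue_shard: str = "default"
    num_queue_shards: int = 1


def load_worker_settings(env: Optional[Mapping[str, str]] = None) -> WorkerSettings:
    """Read worker settings from the environment, applying documented defaults."""
    env = os.environ if env is None else env
    missing = [name for name in ("REDIS_URL", "SCALE_POSTGRES_URL") if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return WorkerSettings(
        redis_url=env["REDIS_URL"],
        scale_postgres_url=env["SCALE_POSTGRES_URL"],
        worker_concurrency=int(env.get("WORKER_CONCURRENCY", 10)),
        worker_job_timeout=int(env.get("WORKER_JOB_TIMEOUT", 3600)),
        worker_max_retries=int(env.get("WORKER_MAX_RETRIES", 3)),
        worker_health_port=int(env.get("WORKER_HEALTH_PORT", 9000)),
        worker_queue_shard=env.get("WORKER_QUEUE_SHARD", "default"),
        num_queue_shards=int(env.get("NUM_QUEUE_SHARDS", 1)),
    )
```

Failing fast on a missing REDIS_URL or SCALE_POSTGRES_URL surfaces a misconfigured deployment before the worker attempts to connect.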

Multi-tenancy

| Variable | Default | Description |
| --- | --- | --- |
| DEFAULT_TENANT_ID | default | Tenant ID assigned to API requests that do not supply a tenant_id field |
| ENABLE_TENANT_ISOLATION | 0 | Set to 1 to enable per-tenant Redis queue sharding |

Database

| Variable | Default | Description |
| --- | --- | --- |
| PGBOUNCER_MODE | 0 | Set to 1 when SCALE_POSTGRES_URL points to PgBouncer in transaction pooling mode |

S3 storage

| Variable | Default | Description |
| --- | --- | --- |
| S3_BUCKET | (empty) | S3 bucket name. Leave blank to disable S3 and use local disk |
| S3_REGION | us-east-1 | AWS region of the bucket |
| S3_PREFIX | synapse | Key prefix. All objects are stored under this path |
| S3_ACCESS_KEY_ID | (empty) | IAM access key. Leave blank to use IAM role or instance profile |
| S3_SECRET_ACCESS_KEY | (empty) | IAM secret key |
| S3_ENDPOINT_URL | (empty) | Custom endpoint URL for S3-compatible stores (MinIO, Cloudflare R2, etc.) |

Observability

| Variable | Default | Description |
| --- | --- | --- |
| OTLP_ENDPOINT | (empty) | OpenTelemetry gRPC endpoint (e.g. http://jaeger:4317). Leave blank to disable tracing |
| METRICS_TOKEN | (empty) | Bearer token for GET /metrics on port 9000. Leave blank for unauthenticated access |
| K8S_MODE | 0 | Set to 1 when running on Kubernetes. Adjusts health probe behaviour and shutdown sequencing |

Retention and limits

| Variable | Default | Description |
| --- | --- | --- |
| PUBSUB_EVENT_TTL | 3600 | Seconds to retain step events in Redis Streams. After this, events expire and cannot be replayed by reconnecting clients |
| RUNS_RETENTION_DAYS | 90 | Days to keep completed run records in Postgres. A cleanup job removes older records automatically |
| MAX_GLOBAL_QUEUE_DEPTH | 1000000 | Maximum total queued + active jobs. Submissions above this limit return 503 |
| RATE_LIMIT_PER_TENANT_RPS | 1000 | Maximum V2 API requests per second per tenant. Exceeded requests return 429 |